Abstract

In this paper, we propose a stochastic model to describe how search service providers charge client 
companies based on users' queries for the keywords related to these companies' ads by using certain 
advertisement assignment strategies. We formulate an optimization problem to maximize the long-term 
average revenue for the service provider under each client's long-term average budget constraint, and 
design an online algorithm which captures the stochastic properties of users' queries and click-through 
behaviors. We solve the optimization problem by making connections to scheduling problems in wireless 
networks, queueing theory and stochastic networks. Unlike prior models, we do not assume that the 
number of query arrivals is known. Due to the stochastic nature of the arrival process considered here,
either temporary "free" service, i.e., service above the specified budget (which we call "overdraft"), or
under-utilization of the budget (which we call "underdraft") is unavoidable. We prove that our online
algorithm can achieve a revenue that is within $O(\epsilon)$ of the optimal revenue while ensuring that the
overdraft or underdraft is $O(1/\epsilon)$, where $\epsilon$ can be arbitrarily small. With a view towards practice, we
can show that one can always operate strictly under the budget. In addition, we extend our results to
a click-through rate maximization model, and also show how our algorithm can be modified to handle
non-stationary query arrival processes and clients with short-term contracts.

Our algorithm also allows us to quantify the effect of errors in click-through rate estimation on the
achieved revenue. We show that the fraction of the revenue lost is bounded in terms of $\Delta$, the relative error
in click-through rate estimation.

We also show that in the long run, an expected overdraft level of $\Omega(\log(1/\epsilon))$ is unavoidable
(a universal lower bound) under any stationary ad assignment algorithm which achieves a long-term
average revenue within $O(\epsilon)$ of the offline optimum.
I. Introduction

Providing online advertising services has been the major source of revenue for search service 
providers such as Google, Yahoo and Microsoft. When an Internet user queries a keyword, 
alongside the search results, the search engine may also display advertisements from some 
companies which provide services or goods related to this keyword. These companies pay the
search service provider a specified price for each ad, on a
pay-per-impression or pay-per-click basis. We call them "clients" in the following text.

Maximizing the revenue obtained from their clients is the key objective of search service 
providers. Research which targets this objective has followed two major directions. One is based 
on auction theory, in which the goal is to design mechanisms in favour of the service provider, 
and much of the research in this direction considers static bids (e.g., [13]; see [10] for a survey),
while dynamic models such as the one in [22] are still emerging. The other is from the perspective of
online resource allocation without considering the impact of the service provider's mechanisms
on the clients' bids, and the main focus of this kind of research is on designing an online
algorithm which posts specific ads in response to each search query arriving online, in order to
achieve a high competitive ratio with respect to the offline optimal revenue. Our work follows
the second direction. 

Our model is as follows: 



Online Advertising Model: 

Assume that queries for keyword q arrive at the search engine according to a stochastic process
at rate $\nu_q$ queries per time slot, where we have assumed that time is discrete and a "time slot" is
our smallest discrete time unit. In response to each query arrival, the search engine may display 
ads from some clients on the webpage. There are L different places (e.g., top, bottom, left, 
right, etc.) on a webpage where ads could be displayed. We will call these places "webpage 
slots." When client i's ad is displayed in webpage slot s when keyword q is queried, there is 
a probability with which the user who is viewing the page (the one who generated the query) 
will click on the ad. This probability, called the "click-through rate," is denoted by $c_{qis}$.

A client specifies the amount of money ("bid") that it is willing to pay to the search service 
provider when a user clicks on its ad related to a specific query. We use $r_{qi}$ to denote this
per-click payment from client i for its ad related to a query for keyword q. Additionally, client
i also specifies an average budget $b_i$, which is the maximum amount that it is willing to pay
per "budgeting cycle" on average, where a budgeting cycle equals N time slots (we have
introduced the notion of a budgeting cycle since the time-scale over which queries arrive may
be different from the time-scales over which budgets may be settled).

The problem faced by the search service provider is then to assign advertisements to webpage 
slots, in response to each query, so that its long-term average revenue is maximized. 



Based on the above model, we design an online algorithm which achieves a long-term average
revenue within $O(\epsilon)$ of the offline optimal revenue, where $\epsilon$ can be chosen arbitrarily small,
indicating the near-optimality of our online algorithm. Before entering into the details, in the 
next two subsections we will first survey the related literature, highlight the main contributions 
of our work, and discuss the differences between our model and previous ones. 

A. Related Work 

We will only survey the online resource allocation models here, and not the auction models. 
The online ads models in prior literature are mainly of two types, namely AdWords (AW)
and Display Ads (DA), which differ in the constrained resource of each client.
In the AW model, the resource is the client's budget, while in the DA model, the resource
is the maximum number of impressions agreed on by the client and the service provider.
Correspondingly, after each resource allocation step, the resource of a client whose ad is posted
is reduced by the bid value¹ in the AW model, or by 1 impression in the DA model. Both of
them belong to a general class of packing linear programs formulated in ||S). Most of the prior
online algorithms for solving the AW and DA models respect the hard constraint on the client's
resources. One exception is [9], where the authors argue that "free disposal" of resources makes
the DA model more tractable (but is not necessary for the AW model).

Mehta et al. [20] modeled the online ads problem as a generalization of an online matching
problem [16] on a bipartite graph of queries and clients. Later, in [5], Buchbinder et al. showed
that matching clients to webpage slots (whether it is a single slot or multiple slots) can be solved
as a maximum-weighted matching problem. Following [5], a number of other online algorithms
using the maximum-weighted bipartite matching idea have been proposed in [19], flU, [0 and
(HI. The algorithms in [15] and [20], which were earlier than [5], can also be regarded as
maximum-weighted matching solutions on this bipartite graph of clients and webpage slots.

¹ This refers to the pay-per-impression scheme. With a pay-per-click scheme, the reduction only happens if the ad is clicked.

In [15], the "b-matching" problem (in the online ads context, bids are trivially 0 or 1
and budgets are all b) is solved by a $1 - 1/e$ competitive algorithm as $b \to \infty$, where the weights
are the remaining budgets of those clients interested in the newly arrived query (i.e., the bid
equals 1). For the online ads problem in which bids and budgets can have general and different
values, [20] (its longer version is [21]) uses the "discounted" bids as the weights corresponding
to each client. The discount factor is calculated by a function $\psi(x) = 1 - e^{x-1}$, of which the input
x is the fraction of a client's budget that has been consumed. Their algorithm is also $1 - 1/e$
competitive, under an assumption that bids are small compared to budgets. By taking advantage
of estimated numbers of query arrivals for each keyword within a given period and modifying
the discount factor in [20], Mahdian et al. [19] designed a class of algorithms which achieve a
considerably better competitive ratio with accurate estimates while still guaranteeing a reasonably
good competitive ratio with inaccurate estimates, also assuming small bids.

The algorithms in (5J, ||5), (61 , (H and (TJ, all use a primal-dual framework to compute a
maximum-weighted matching at each iteration, in which the dual variables (corresponding to
each client) are used to determine the weights. The two $1 - 1/e$ competitive algorithms in (51
and update the dual variables dynamically in their primal-dual type algorithms every time
a decision is made. Specifically, each dual variable in Q, which implicitly tracks the fraction
of budget that has been spent by the corresponding client, grows during each iteration at a rate
parameterized by the fraction of the bid for the incoming query in this client's total budget, while
BH uses an "exponentially weighted average" of the up-to-date n(i) most valuable impressions²
assigned to client i as a new dual variable with respect to this client. On the other hand, the three
dual-type learning-based algorithms in [0, (HI and 0] achieve a competitive ratio of $1 - O(\epsilon)$
based on a random-order arrival model (rather than the adversarial model in most of the earlier
work), assuming small bids and knowledge of the total number of queries. The main difference
between them is that O and [[HI use an initial $\epsilon$ fraction of queries to learn the optimal dual
variables (with respect to this training set), while the algorithm in [1] repeats the learning process
over geometrically growing intervals. Additionally, the "small bids" condition in [OQ is slightly
weaker than the condition in |0 and (HI.

B. Our Contributions and Comparison to Prior Work 

As in prior work (especially [5] and flU), our solution relies on a primal-dual framework to
solve a maximum-weighted matching problem on a bipartite graph of clients and webpage slots,
with dynamically updated dual variables which contribute to the weights on the edges of the
bipartite graph. However, unlike prior work, we are able to obtain a revenue which is within $O(\epsilon)$
of the optimal revenue using a purely adaptive algorithm, without needing to know
the number of query arrivals over a time period or the average arrival rates.

Our solution is related to scheduling problems in wireless networks. In particular, we use
the optimization decomposition ideas in [11], the stochastic performance bounds in [18] and
the modeling of delay-sensitive flows in [14]. Borrowing from that literature, we introduce
the concept of an "overdraft" queue. The overdraft queue measures the amount by which the
provided service temporarily exceeds the budget specified by a client. In making the connection
to wireless networks, we define the "per-client revenue region," which is
related to the concept of capacity region in queueing networks (see [11], [18]). In our context,
it characterizes the revenue extractable from each client as a function of all the clients' budgets.

² In the DA model in (5J, n(i) is defined as the maximum number of impressions agreed for client i. After allowing free
disposal, only the current n(i) most valuable impressions assigned to client i will be considered.

Our online algorithm exhibits a trade-off between the revenue obtained by the service provider 
and the level of overdrafts. We can further modify our online algorithm so that clients can always 
operate strictly under their budgets. Finally, our algorithm and analysis naturally allow us to 
assess the impact of click-through rate estimation on the service provider's revenue.

We are able to show that our online algorithm achieves an overdraft level of $O(1/\epsilon)$. So a
natural question is whether this bound is tight. We show that the overdraft for any algorithm
must be $\Omega(\log(1/\epsilon))$. While there is a gap between the upper and lower bounds, together they
imply that the overdraft must increase when $\epsilon$ goes to zero. This work is related to [3], [25],
[26], [24] and [12] in the context of communication networks. See Section IV for a detailed
survey.

Besides the revenue maximization model, we also study another online ads model in which the 
objective is to maximize the average overall click-through rate, subject to a minimum impression 
requirement for each client. We also show that our results can be naturally extended to handle
non-stationary query arrival processes and clients which have short-term contracts with the
service provider.

Like the algorithm in 0]], our algorithm can also be generalized to a wider class of linear 
programs within different application contexts, where the coefficients in the objective function 
and constraints are not necessarily nonnegative. 

There are two points of departure in our algorithm compared to existing models: the first one 
is that we assume a purely stochastic model in which the query arrival rates are unknown. Thus, 
there is no need to know the number of arrivals in a time period as in prior models, and this 
is even true for non-stationary query arrival processes. The other is that we assume an average
budget rather than a fixed budget over a time horizon. This allows us to better model permanent clients
(e.g., big companies that do not stop advertising) that do not provide a fixed time-horizon
budget. Clients who advertise for a limited amount of time can also be handled well since the
algorithm is naturally adaptive. 

A minor difference with respect to prior models is that our model assumes that time is slotted. 
This can be easily modified to assume that query arrivals can occur at any time according to 
some continuous-time stochastic process. The only difference is that our analysis would then 
involve continuous-time Lyapunov drift instead of the discrete-time drift used in this paper. From 
a theoretical point of view, our analysis is different from prior work which uses competitive ratios: 
our model and solution are similar in spirit to stochastic approximation flU where gradients (here
the gradient of the dual objective) are known only with stochastic perturbations. This point of 
view is essential to model stochastic traffic with unknown statistics. 

Instead of the $1 - O(\epsilon)$ competitive ratio in prior work, we show that our algorithm achieves
a revenue which is within $O(\epsilon)$ of the optimal revenue. The $O(\epsilon)$ penalty arises due to the
stochastic nature of our model. However, we do not require assumptions such as knowledge
of the total number of queries in a given period [19], [0, (HI, [1], or information on keyword
frequencies [19].³

³ It should be mentioned that another common assumption, "small bids" (or "large budgets", "large offline optimal value"),
used in [15], [20], [19], [9], HJ and [8], is not essentially different from our "long-term" assumption.
C. Organization of the Paper

The rest of the paper is organized as follows: In Section II we formulate an optimization
problem involving long-term averages. In Section III we start considering the stochastic version
of our model and propose an online algorithm, which also introduces the concept of an "overdraft
queue." Performance analysis of this online algorithm, which includes the near-optimality of
the long-term revenue and an upper bound on the overdraft level, is also given in Section
III. The last two subsections of Section III present two extensions, namely decisions based
on estimated click-through rates and the "underdraft" mechanism. In Section IV we derive a
universal lower bound on the expected overdraft level under any stationary algorithm for online
advertising. The second online ads model, the "click-through rate maximization problem," with its
related extensions, algorithm design and analysis, is given in Section V. Section VI concludes
the paper.

Compared to an earlier version of this paper which appeared in [28], we give a more detailed
literature survey in Subsection I-A, all the proofs for the lemmas, theorems and corollaries in
Section III (we only stated these results without proofs in [28] due to page limits), and a full
discussion of the underdraft mechanism in Subsection III-F. Sections IV and V are completely
new.

II. An Optimization Problem Involving Long-Term Averages 

Based on the model described in Section I, we first pose the revenue maximization problem as
an optimization problem involving long-term averages. For this purpose, we define an assignment
of clients to webpage slots as a matrix M of which the (i, s)-th element is defined as follows:

\[
M_{is} = \begin{cases} 1, & \text{if client } i \text{ is assigned to webpage slot } s, \\ 0, & \text{else.} \end{cases}
\]

The matrix M has to satisfy some practical constraints. First, a webpage slot can be assigned to
only one client and vice versa. Furthermore, the assignment of clients to certain webpage slots
may be prohibited for certain queries. For example, it may not make sense to advertise chocolates
when someone is searching for information about treatments for diabetes. These constraints can
be abstracted as follows: For the queried keyword q, the assignment matrix has to
belong to some set $\mathcal{M}_q$. We also let $p_{qM}$ be the probability of choosing matrix M when the
queried keyword is q.
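
For small instances, the feasible set for a keyword can be enumerated explicitly. The following is a minimal sketch, assuming that the only constraints are "one client per slot, one slot per client" plus a per-keyword list of permitted clients; the function name `feasible_assignments` and the tuple encoding (entry s holds the client shown in slot s, or `None` for an empty slot) are illustrative choices, not notation from the paper.

```python
from itertools import permutations

def feasible_assignments(allowed_clients, num_slots):
    """Enumerate assignments of clients to webpage slots for one keyword.

    Each assignment is a tuple of length num_slots whose s-th entry is the
    client displayed in slot s, or None if the slot is left empty. A client
    appears in at most one slot, mirroring the constraint on the matrix M.
    Allowing empty slots is an assumption of this sketch.
    """
    # Pad the pool with None so that any subset of slots may stay empty.
    pool = list(allowed_clients) + [None] * num_slots
    seen = set()
    for perm in permutations(pool, num_slots):
        if perm not in seen:          # drop duplicates created by the padding
            seen.add(perm)
            yield perm
```

With two permitted clients and two slots this yields seven assignments, including the empty one; brute-force enumeration like this is only practical when both numbers are small.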

The optimization problem is then given by 

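\[
\max_{p} \; R(p) = \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is} c_{qis} r_{qi} \tag{1}
\]

subject to

\[
N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} c_{qis} r_{qi} \le b_i, \quad \forall i; \tag{2}
\]

\[
0 \le p_{qM} \le 1, \quad \forall q, \; M \in \mathcal{M}_q; \tag{3}
\]

\[
\sum_{M \in \mathcal{M}_q} p_{qM} = 1, \quad \forall q. \tag{4}
\]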

In the above formulation, the objective (1) is the average revenue per time slot and constraint (2)
expresses the fact that the average payment over a budgeting cycle should not exceed the average
budget. The optimization is a linear program and if all the problem parameters are known, in
principle, it can be solved offline, returning probabilities $\{p_{qM}\}$ which can be used by a service
provider to maximize its revenue. However, such an offline solution is not desirable for at least 
two reasons: 

• Being a static approach, it does not use any feedback about the current state of the system. 
For example, the fact that the empirical average payment of a client has severely exceeded 
its average budget would have no impact on the subsequent assignment strategy. Since 
the formulation and hence, the solution, only cares about long-term budget constraint 
satisfaction, severe overdraft or underdraft of the budget can occur over long periods of 
time. 

• The offline solution is a function of the query arrival rates $\{\nu_q\}$. Thus, a change in the
arrival rates would require a recomputation of the solution. 
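
To make the offline formulation concrete, the sketch below evaluates the objective and the left-hand side of the budget constraint for a given randomized policy. It is illustrative only: the function name `revenue_and_spend`, the dictionary layout, and the tuple encoding of an assignment (entry s is the client in slot s, or `None`) are assumptions of this example, and no linear-programming solver is involved.

```python
def revenue_and_spend(nu, p, c, r, N):
    """Evaluate a randomized policy against the offline linear program.

    nu: dict keyword -> arrival rate per time slot
    p:  dict keyword -> list of (probability, assignment) pairs, where an
        assignment is a tuple whose s-th entry is a client or None
    c:  dict (keyword, client, slot) -> click-through rate
    r:  dict (keyword, client) -> per-click payment
    N:  number of time slots per budgeting cycle
    Returns (average revenue per time slot, dict client -> average payment
    per budgeting cycle); the latter is compared with the budgets b_i.
    """
    revenue = 0.0
    spend = {}
    for q, mixture in p.items():
        for prob, assignment in mixture:
            for s, i in enumerate(assignment):
                if i is None:
                    continue
                pay = nu[q] * prob * c[(q, i, s)] * r[(q, i)]
                revenue += pay
                spend[i] = spend.get(i, 0.0) + N * pay
    return revenue, spend
```

A policy is feasible for the offline problem when every entry of the returned spend dictionary is at most the corresponding average budget.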

In view of these limitations of the offline solution, we propose an online solution which 
adaptively assigns client advertisements to webpage slots to maximize the revenue. As we will 
see, the online solution does use feedback about the overdraft (or underdraft) level in future
decisions, and does not require knowledge of $\{\nu_q\}$.



III. Online Algorithm and Performance Analysis 

A. A Dual Gradient Descent Solution 

To get some insight into a possible adaptive solution to the problem, we first perform a dual 
decomposition which suggests a gradient solution. However, a direct gradient solution will not
take into account the stochastic nature of the problem and will also require knowledge of
the query arrival rates $\{\nu_q\}$. We will address these issues in the following subsections, using
techniques that, to the best of our knowledge, have not been used in prior literature on the online 
advertising problem. 

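We append the constraint (2) to the objective (1) using Lagrange multipliers $\delta_i \ge 0$ to obtain
a partial Lagrangian function

\[
\begin{aligned}
\mathcal{L}(p, \delta) &= \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is} c_{qis} r_{qi} - \sum_i \delta_i \Big( \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} c_{qis} r_{qi} - \frac{b_i}{N} \Big) \\
&= \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is} c_{qis} r_{qi} (1 - \delta_i) + \sum_i \frac{\delta_i b_i}{N},
\end{aligned}
\]

subject to constraints (3) and (4). The dual function is

\[
D(\delta) = \max_{p} \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is} c_{qis} r_{qi} (1 - \delta_i) + \sum_i \frac{\delta_i b_i}{N},
\]

subject to constraints (3) and (4). Note that the maximization part in the dual function can be
decomposed into independent maximization problems with regard to each queried keyword q,
i.e., for all q,

\[
\max_{p_q} \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is} c_{qis} r_{qi} (1 - \delta_i) = \max_{M \in \mathcal{M}_q} \sum_{i,s} M_{is} c_{qis} r_{qi} (1 - \delta_i),
\]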

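where it is easy to see that each maximization is solved by a deterministic solution. This suggests
the following primal-dual algorithm to iteratively solve the original optimization problem (1): at
step k,

\[
\forall q, \quad M^*(q, k) \in \arg\max_{M \in \mathcal{M}_q} \sum_{i,s} M_{is} c_{qis} r_{qi} (1 - \delta_i(k));
\]

\[
\forall i, \quad \delta_i(k+1) = \Big[ \delta_i(k) + \epsilon \Big( N \sum_q \nu_q \sum_s [M^*(q, k)]_{is} c_{qis} r_{qi} - b_i \Big) \Big]^+,
\]

where $\epsilon > 0$ is a fixed step-size parameter, and $[x]^+ = x$ if $x \ge 0$ and $[x]^+ = 0$ otherwise.
Furthermore, defining $Q_i(k) = \delta_i(k)/\epsilon$, the above iterative algorithm becomes

\[
\forall q, \quad M^*(q, k) \in \arg\max_{M \in \mathcal{M}_q} \sum_{i,s} M_{is} c_{qis} r_{qi} \Big( \frac{1}{\epsilon} - Q_i(k) \Big);
\]

\[
\forall i, \quad Q_i(k+1) = \big[ Q_i(k) + \lambda_i(k) - b_i \big]^+,
\]

where

\[
\lambda_i(k) = N \sum_q \nu_q \sum_s [M^*(q, k)]_{is} c_{qis} r_{qi}. \tag{5}
\]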



Note that $Q_i(k)$ can be interpreted as a queue which has $\lambda_i(k)$ arrivals and $b_i$ departures at
step k. Although this algorithm already uses the feedback provided by $\{Q(k)\}$ (or $\{\delta(k)\}$)
about the state of the system, it is still using a priori information about the arrival rates of
queries in $\{\lambda(k)\}$, and hence is not really "online." However, it motivates us to incorporate a queueing
system with stochastic arrivals into the real online algorithm, which will be described in the next 
subsection. 
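
The deterministic iteration above can be sketched in a few lines. This is a simplified illustration under stated assumptions: there is a single webpage slot (L = 1), so the matching step reduces to picking the client with the largest positive weight, and the arrival rates are known, which is exactly the a priori information the online algorithm later avoids. The name `dual_iteration` and the dictionary layout are hypothetical.

```python
def dual_iteration(nu, c, r, b, N, eps, steps):
    """Queue form of the dual update with known arrival rates, single slot.

    nu: dict keyword -> arrival rate; c: dict (keyword, client) -> CTR;
    r: dict (keyword, client) -> bid; b: dict client -> average budget.
    Returns the queue lengths Q_i after the given number of steps.
    """
    Q = {i: 0.0 for i in b}
    for _ in range(steps):
        lam = {i: 0.0 for i in b}
        for q, rate in nu.items():
            # Matching step: maximize c * r * (1/eps - Q_i); show no ad if
            # every weight is non-positive.
            best, best_w = None, 0.0
            for i in b:
                w = c.get((q, i), 0.0) * r.get((q, i), 0.0) * (1.0 / eps - Q[i])
                if w > best_w:
                    best, best_w = i, w
            if best is not None:
                lam[best] += N * rate * c[(q, best)] * r[(q, best)]
        # Queue update: lambda_i arrivals, b_i departures, projected at zero.
        Q = {i: max(Q[i] + lam[i] - b[i], 0.0) for i in b}
    return Q
```

With one client whose demand per cycle exceeds its budget, the queue alternates between serving and throttling, which is the feedback effect that the offline solution lacks.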



B. Stochastic Model, Online Algorithm, and "Overdraft Queue"

In practice, a search service provider may not have a priori information about the query arrival
rates $\{\nu_q\}$, and generally, query arrivals during each time slot are stochastic rather than constant.
Let time slots be indexed by $t \in \mathbb{Z}^+ \cup \{0\}$. We specify our detailed statistical assumptions as
follows:

• Query arrivals: Assume that a time slot is short enough so that query arrivals in each time
slot can be modeled as a Bernoulli random variable with occurrence probability $\nu$. The
probability that an arrived query is for keyword q is assumed to be $\vartheta_q$, with $\sum_q \vartheta_q = 1$.
Let q(t) represent the index of the keyword queried in time slot t, such that q(t) = q w.p.
$\nu_q = \nu \vartheta_q$ for all q (indexed by positive integers) and q(t) = 0 w.p. $1 - \nu$, which accounts
for the case that no query arrives.

• Budget spending: We limit the values of budget spent in each budgeting cycle to be integers.
To match the average budget $b_i$ (when it is not an integer), the budget of client i in budgeting
cycle k is assumed to be a random variable $b_i(k)$ which equals $\lceil b_i \rceil$ w.p. $\varrho_i$ and $\lfloor b_i \rfloor$ otherwise,
such that $E[b_i(k)] = \varrho_i \lceil b_i \rceil + (1 - \varrho_i) \lfloor b_i \rfloor = b_i$, i.e., $\varrho_i = \frac{b_i - \lfloor b_i \rfloor}{\lceil b_i \rceil - \lfloor b_i \rfloor} = b_i - \lfloor b_i \rfloor$. For the trivial
case that $b_i$ is already an integer, we let $\varrho_i = 1$.

• Click-through behaviors: In time slot t, after a query for keyword q arrives, if the ad of
client i is posted on webpage slot s in response to this query, then whether this ad will be
clicked is modeled as a Bernoulli random variable $c_{qis}(t)$ with occurrence probability $c_{qis}$.
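
The first two assumptions translate directly into samplers. The sketch below is a hedged illustration: the function names `sample_query` and `sample_budget` are made up for this example, and the generator is Python's standard `random.Random`.

```python
import math
import random

def sample_query(nu, theta, rng):
    """Return a keyword index in 1..len(theta), or 0 if no query arrives.

    nu is the per-slot arrival probability and theta[q-1] is the probability
    that an arrived query is for keyword q (the entries sum to one).
    """
    if rng.random() >= nu:
        return 0
    # Draw one keyword with probabilities proportional to theta.
    return rng.choices(range(1, len(theta) + 1), weights=theta)[0]

def sample_budget(b_i, rng):
    """Randomized integer budget whose mean equals the average budget b_i."""
    lo = math.floor(b_i)
    round_up_prob = b_i - lo   # zero when b_i is already an integer
    return lo + 1 if rng.random() < round_up_prob else lo
```

For an integer b_i the sampler always returns b_i itself, which has the same effect as the convention of rounding up with probability one, since the ceiling and floor coincide in that case.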

We now want to implement the above iterative algorithm online based on this stochastic model.
According to definition (5), $\lambda_i$ includes average query arrivals and click-through choices within
N time slots (i.e., one budgeting cycle). Thus, each iteration step in the online algorithm should
correspond to a budgeting cycle. For convenience, we define

\[
u(k) = \{ q(t), \; c(t) \ \text{for} \ kN \le t \le kN + N - 1 \}
\]

as a collection of random variables describing user behaviors (including stochastic query arrivals
and click-through choices) in budgeting cycle k. The online algorithm is then described as
follows: 

Online Algorithm: (in each budgeting cycle $k \ge 0$)

In each time slot $t \in [kN, kN + N - 1]$, if q(t) > 0, choose the assignment matrix

\[
M^*(t, q(t), Q(k)) \in \arg\max_{M \in \mathcal{M}_{q(t)}} \sum_{i,s} M_{is} c_{q(t)is} r_{q(t)i} \Big( \frac{1}{\epsilon} - Q_i(k) \Big). \tag{6}
\]

At the end of budgeting cycle k, for each client i, update

\[
Q_i(k+1) = \big[ Q_i(k) + A_i(k, Q(k), u(k)) - b_i(k) \big]^+, \tag{7}
\]

where

\[
A_i(k, Q(k), u(k)) \triangleq \sum_{t=kN}^{kN+N-1} \sum_s [M^*(t, q(t), Q(k))]_{is} \, c_{q(t)is}(t) \, r_{q(t)i}. \tag{8}
\]



Here, $A_i(k, Q(k), u(k))$ represents the revenue obtained by the service provider from client i
during budgeting cycle k, and recall that $b_i(k)$ is a random variable which takes integer values
whose mean is equal to the average budget per budgeting cycle.

In this algorithm, client i is associated with a virtual queue $Q_i$ (maintained at the search
service provider). During budgeting cycle k, the amount of money client i is charged by the
search service provider, $A_i(k, Q(k), u(k))$, is the arrival to this queue, and the average budget per
budgeting cycle $b_i$ is the departure from this queue. Note that if this queue is positive, it means
that the total value of the real service already provided to the client has temporarily exceeded 
the client's budget, i.e., "free" service has been provided temporarily. Hence, we call this queue 
the "overdraft queue." 

There are two different time scales here. The faster one is a time slot, the smallest time unit 
used to capture user behaviors (including stochastic query arrivals and click-through choices) 
and execute ad-posting strategies. The slower one is a budgeting cycle (equal to N time slots), 
at the end of which the overdraft queues are updated based on the revenue obtained over the 
whole budgeting cycle. 
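
The two time scales can be seen in a small simulation. The following is a minimal sketch, not the authors' implementation: it brute-forces the matching in step (6) over all assignments (viable only for a handful of clients and slots), allows empty slots, projects the queue at zero, and uses hypothetical names (`best_assignment`, `run_online`) and dictionary layouts.

```python
import math
import random
from itertools import permutations

def best_assignment(q, Q, c, r, clients, num_slots, eps):
    """Brute-force the max-weight assignment for keyword q given queues Q."""
    pool = list(clients) + [None] * num_slots      # None = leave slot empty
    best, best_val = (None,) * num_slots, 0.0
    for perm in permutations(pool, num_slots):
        val = sum(c.get((q, i, s), 0.0) * r.get((q, i), 0) * (1.0 / eps - Q[i])
                  for s, i in enumerate(perm) if i is not None)
        if val > best_val:
            best, best_val = perm, val
    return best

def run_online(theta, nu, c, r, b, N, eps, cycles, num_slots=1, seed=0):
    """Simulate the online algorithm; returns (revenue per slot, max queue)."""
    rng = random.Random(seed)
    clients = sorted(b)
    Q = {i: 0 for i in clients}
    total_revenue, max_Q = 0, 0
    for _ in range(cycles):
        A = {i: 0 for i in clients}                # payments in this cycle
        for _ in range(N):                         # fast time scale: slots
            if rng.random() >= nu:
                continue                           # no query in this slot
            q = rng.choices(range(1, len(theta) + 1), weights=theta)[0]
            shown = best_assignment(q, Q, c, r, clients, num_slots, eps)
            for s, i in enumerate(shown):
                if i is not None and rng.random() < c.get((q, i, s), 0.0):
                    A[i] += r[(q, i)]              # client pays per click
        for i in clients:                          # slow time scale: cycles
            lo = math.floor(b[i])
            budget = lo + 1 if rng.random() < b[i] - lo else lo
            Q[i] = max(Q[i] + A[i] - budget, 0)
            max_Q = max(max_Q, Q[i])
        total_revenue += sum(A.values())
    return total_revenue / (cycles * N), max_Q
```

In this sketch the queues are only refreshed at cycle boundaries, so a client whose queue is below 1/ε at the start of a cycle may still overshoot within that cycle; this is the source of the additive term of order N in the overdraft bound.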

We make the following assumptions on the above stochastic model: $\{q(t)\}$ are i.i.d. across
time slots t; $\{c_{qis}(t)\}$ are independent across q, i, s, and t; each variable in $\{q(t)\}$ and each
variable in $\{c_{qis}(t)\}$ are mutually independent. In fact, the model can be generalized to allow
for query arrivals correlated over time and across keywords, and other similar correlations inside 
the click- through choices or between these two stochastic processes. Such models would only 
make the stochastic analysis more cumbersome, but the main results will continue to hold under 
these more general models. 

In order to guarantee that the Markov chain which we will define later is both irreducible and
aperiodic, we further assume that the probability of a query arrival in a time slot satisfies
$\nu \in (0, 1)$. We also assume that $r_{qi}$ for all q and i can only take integer values. Together with
the fact that $b_i(k)$ takes integer values, $\{Q(k)\}$ becomes a discrete-time integer-valued queue.
Note that assuming integer values is only for ease of analysis and is not necessary.
C. An Upper Bound on the Overdraft

According to the ad assignment step (6), if at the beginning of budgeting cycle k, $Q_i(k) > 1/\epsilon$,
then for this budgeting cycle, the i-th row of $M^*(t, q, Q(k))$ is always a zero vector, i.e., the service
provider will not post the ads of client i until $Q_i(k)$ falls below $1/\epsilon$. Since by assumption the
number of query arrivals per time slot is upper bounded, for any budgeting cycle k, one can
bound the transient length of each overdraft queue as below:

\[
Q_i(k) \le \frac{1}{\epsilon} + N \cdot \max_{q,s} \{ r_{qi} c_{qis} \} - \lfloor b_i \rfloor, \quad \forall i.
\]

Therefore, $Q_i(k) \sim O(1/\epsilon)$ for all i, and stability is not an issue for these "upper bounded"
queues. It further implies that this online algorithm satisfies the budget constraints in the long
run, i.e., for all clients i,

\[
\lim_{K \to \infty} E \Big[ \frac{1}{K} \sum_{k=0}^{K-1} A_i(k, Q(k), u(k)) \Big] \le b_i \tag{9}
\]

must hold.

It should be mentioned that in [12], through using the LIFO queueing discipline, the authors
show an $O((\log(1/\epsilon))^2)$ bound on the average waiting time encountered by most of the packets,
which is tighter than the bound $O(1/\epsilon)$ under the FIFO queueing discipline (see, e.g., [11]; our
above result also fits this bound). While the length of a FIFO queue is proportional to the arrival
rate according to Little's law 0, the length of a LIFO queue in [12] is still $O(1/\epsilon)$, even if it is
occupied by very "old" packets which only account for a negligible fraction $O(\epsilon^{\log(1/\epsilon)})$ of all
the packets that have arrived. Unlike in a communication network, where waiting time is usually
the main concern and dropping a small fraction of old packets does almost no harm to many
online applications, what clients of an online advertising service care about is how much they have
paid beyond their budgets, which is measured by the overdraft queue in our model.



D. Near-Optimality of the Online Algorithm 

We now show that, in the long term, the proposed online algorithm achieves a revenue that is
close to the optimal revenue $R(p^*)$ (where $p^*$ is the solution to the optimization problem (1)).
We start with the following lemma:

Lemma 1: Consider the Lyapunov function $V(Q) = \frac{1}{2} \sum_i Q_i^2$. For any $\epsilon > 0$, and each time
period k,

\[
E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \le -\frac{N}{\epsilon} \big( R(p^*) - R(p^*(k, Q)) \big) + B_1 - B_2 \sum_i Q_i.
\]

Here,

\[
B_1 \triangleq \frac{1}{2} \Big\{ (N(N-1)L^2 + NL) \big( \max_{q,i,s} \{ c_{qis} r_{qi} \} \big)^2 + \sum_i \big( \lceil b_i \rceil^2 (b_i - \lfloor b_i \rfloor) + \lfloor b_i \rfloor^2 (1 - b_i + \lfloor b_i \rfloor) \big) \Big\}, \tag{10}
\]

where L is the number of webpage slots;

\[
B_2 = \min_i \Big\{ b_i - N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p^*_{qM} \sum_s M_{is} c_{qis} r_{qi} \Big\}; \tag{11}
\]

and $p^*(k, Q) = \{ p^*_{qM}(k, Q), \ \forall q, M \in \mathcal{M}_q \}$, where $p^*_{qM}(k, Q)$ equals 1 if $M = M^*(t, q, Q)$ for
$kN \le t \le kN + N - 1$ (i.e., the optimal matrix in the maximization step (6)) and 0 otherwise. $\diamond$

The proof is given in Appendix A.

Now we are ready to present one of the major theorems in this paper, indicating that the
long-term average revenue achieved by our online algorithm is within $O(\epsilon)$ of the maximum
revenue obtained by the offline optimal solution. The proof is given in Appendix B.

Theorem 1: For any $\epsilon > 0$,

\[
0 \le R(p^*) - \lim_{K \to \infty} E \Big[ \frac{1}{KN} \sum_{k=0}^{K-1} R(k) \Big] \le \frac{B_1 \epsilon}{N}
\]

for some constant $B_1 > 0$ (defined in (10) in Lemma 1), where $R(k) = \sum_i A_i(k, Q(k), u(k))$
is defined as the revenue obtained during budgeting cycle k. $\diamond$

Remark 1: If we choose a very small $\epsilon$, the matching in (6) behaves like a greedy solution
until the queue lengths grow comparably large. This indicates a tradeoff between how close to
the long-term optimal revenue the algorithm can get and the actual convergence time.

Additionally, supposing that $\{r_{qi}\}$ and $\{b_i\}$ are both measured in another scale with a factor
$\alpha$, e.g., using cents instead of dollars ($\alpha = 100$), and assuming that $\alpha$ is unknown, it can be
shown that the $O(\epsilon)$ convergence bound will also be scaled by $\alpha$ if we measure the revenue in
the original scale. To change the algorithm into a "scale-free" version, $\{r_{qi}\}$ and $\{b_i\}$ should
be divided by a common benchmark value, e.g., the largest budget specified by all the initially
existing clients. Since the benchmark value is also implicitly multiplied by $\alpha$ if measured in
another scale, the scaling factor will be canceled in the normalized $\{r_{qi}\}$ and $\{b_i\}$ and no longer
affect the convergence bound. $\diamond$
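
The normalization described in the remark can be illustrated as follows. This is only a sketch: the function name `normalize` is hypothetical, and the benchmark is taken to be the largest budget among the clients passed in, which is one of the choices the remark mentions.

```python
def normalize(r, b):
    """Divide bids and budgets by a common benchmark (the largest budget).

    r: dict (keyword, client) -> bid; b: dict client -> average budget.
    The returned values are unchanged if r and b are both rescaled by the
    same currency factor, e.g. when switching from dollars to cents.
    """
    benchmark = max(b.values())
    r_norm = {key: value / benchmark for key, value in r.items()}
    b_norm = {key: value / benchmark for key, value in b.items()}
    return r_norm, b_norm
```

After normalization the bids are generally no longer integers; as noted earlier, the integer assumption is only for ease of analysis.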



E. Impact of Click-Through Rate Estimation 

In our online algorithm, the decision of picking an optimal ad assignment matrix in (6) in
response to each query is based on the true click-through rates c. In reality, an estimate $\hat{c}$ based
on historical click-through behaviors is used, i.e., in response to each query for keyword q, which
arrives in time slot $t \in [kN, kN + N - 1]$, we choose the assignment matrix

\[
M^*(t, q(t), Q(k)) \in \arg\max_{M \in \mathcal{M}_{q(t)}} \sum_{i,s} M_{is} \hat{c}_{q(t)is} r_{q(t)i} \Big( \frac{1}{\epsilon} - Q_i(k) \Big). \tag{12}
\]

We then have the following corollary in addition to Theorem 1 in Subsection III-D.



Corollary 1: Assume that the estimated click-through rates $\hat{c} \in [c(1-\Delta), c(1+\Delta)]$ with some $\Delta \in (0,1)$. Under our online algorithm with estimated click-through rates, $Q(k)$ is still positive recurrent. Then, for any $\epsilon > 0$,

$$\lim_{K\to\infty} E\left[\frac{1}{KN}\sum_{k=0}^{K-1} R(k)\right] \ge \frac{1-\Delta}{1+\Delta}\, R^* - \frac{B_1 \epsilon}{N},$$

for some constant $B_1 > 0$ (defined in equation (10) in Lemma 1). $\diamond$






Proving this needs some minor changes to the proofs of Lemma 1 and Theorem 1, which are shown in the appendix.

Remark 2: Corollary 1 tells us that for small $\epsilon$, the long-term average revenue achieved by our online algorithm with estimated click-through rates will be at least a fraction $\frac{1-\Delta}{1+\Delta}$ of the offline optimal revenue. $\diamond$



F. Underdraft: Staying under the Budget 

In the previous sections, we allowed the provision of temporary free service to clients, which we call overdraft. If this is not desirable for some reason, the algorithm can be modified to have non-positive overdraft. We do this by allowing the queue lengths to become negative, but not positive. The practical meaning of negative queue lengths is to allow each client to accumulate a certain volume of "credits" if the current budget is under-utilized and use these credits to offset possible future overdrafts. We call this negative queue length "underdraft." Corresponding to this mechanism, we modify our online algorithm as follows: in response to each query for keyword $q$, which arrives in time slot $t \in [kN, kN+N-1]$, choose the assignment matrix

$$M^*(t, q(t), Q(k)) \in \arg\max_{M \in \mathcal{M}_{q(t)}} \sum_{i,s} M_{is}\, c_{q(t)is}\, r_{q(t)i}\, \left(\Gamma_i - Q_i(k)\right),$$

and at the end of budgeting cycle $k$, for each client $i$, update

$$Q_i(k+1) = \max\{Q_i(k) + A_i(k, Q(k), u(k)) - b_i(k),\ -C_i\},$$

where $\Gamma_i$ denotes a customized "throttling threshold" (not necessarily $1/\epsilon$) and $C_i$ denotes the maximum allowable credit volume for client $i$. Recall that $A_i(k, Q(k), u(k))$ was defined earlier for the original online algorithm.

We can bound each overdraft queue as below:

$$Q_i(k) \le \Gamma_i + N \cdot \max_{q,s}\{r_{qi}\, c_{qis}\} - \lfloor b_i \rfloor, \quad \forall i, k.$$

Thus, if our objective is to eliminate overdrafts (i.e., $Q_i(k) \le 0$ for all $k$), we can set

$$\Gamma_i = \left[\lfloor b_i \rfloor - N \cdot \max_{q,s}\{r_{qi}\, c_{qis}\}\right]^-, \quad \forall i, \tag{13}$$

where, in contrast to $[x]^+$, $[x]^-$ takes the non-positive part of $x$, i.e., $[x]^- = x$ if $x < 0$ and $[x]^- = 0$ otherwise. We further let

$$C_i := \frac{1}{\epsilon} - \Gamma_i, \quad \forall i,$$

so that after converting $Q_i(k)$ to be nonnegative by using $\hat{Q}_i(k) = Q_i(k) + C_i$ for all $i$, everything is transformed back to the original online algorithm except that each $Q_i(k)$ is replaced by $\hat{Q}_i(k)$. Hence we can still show that the revenue achieved by this modified version of the online algorithm is within $O(\epsilon)$ of the optimal revenue.

It might seem counter-intuitive that by letting $\epsilon$ go to zero, we can incur potentially large underdrafts (under-utilization of the budget) and yet are able to achieve maximum revenue. This is not a contradiction: for each fixed $\epsilon$, in the long term, the average service provided to each client is close to the average budget. The $O(1/\epsilon)$ is a fixed amount by which the total budget up to any time $T$ is under-utilized, and, after being divided by $T$, it goes to zero when $T$ approaches infinity.
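The underdraft mechanism can be sketched in Python as follows. The sketch assumes a throttling threshold equal to the non-positive part of $\lfloor b_i \rfloor - N \max_{q,s} r_{qi} c_{qis}$ and a credit cap $C_i = 1/\epsilon - \Gamma_i$, which is how the discussion above reads; all function and argument names are hypothetical.

```python
import math


def throttling_threshold(b, N, max_rc):
    """Non-positive part of floor(b) - N * max_rc (assumed form of Gamma_i)."""
    return min(math.floor(b) - N * max_rc, 0.0)


def credit_cap(eps, gamma):
    """Maximum credit volume C_i = 1/eps - Gamma_i (assumed form)."""
    return 1.0 / eps - gamma


def update_queue(Q, A, b_k, C):
    """End-of-cycle update: add revenue A, subtract budget b_k, floor at -C."""
    return max(Q + A - b_k, -C)


def matching_weight(c, r, gamma, Q):
    """Weight of showing a client's ad: c * r * (Gamma_i - Q_i)."""
    return c * r * (gamma - Q)
```

With the two-client example parameters ($b = 0.6$, $N = 1$, $r c = 0.5$) the threshold is $-0.5$, and a client holding the maximum credit keeps its queue at $-C_i$ when it is not served.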

We note that while an underdraft does not seem to significantly hurt either the client, who actually benefits from an underdraft, or the service provider, whose long-run average revenue is still diminished only by $O(\epsilon)$, large values of the underdraft may result in temporary unfairness in the system.$^4$ If, for example, a client accumulates a large underdraft compared to the other clients, then it may receive priority over other clients for long periods of time. To illustrate this, we consider an example with two clients and one queried keyword. Assume that $\Gamma_i < 0$ for $i = 1, 2$, and at time slot $k_0$, $Q_1(k_0) = \Gamma_1$ and $Q_2(k_0) = -C_2$ (this occurs with a positive probability due to the ergodicity of the Markov chain $\{Q(k)\}$ proved before). We simulate the sample paths of the weights in the maximization step of the modified algorithm with the following setting: budgets $b_1 = b_2 = 0.6$, click-through rates $c_1 = c_2 = 0.5$, revenue-per-click $r_1 = r_2 = 1$; the number of query arrivals per time slot equals 2 w.p. 0.5 and 0 otherwise; a budgeting cycle equals one time slot ($N = 1$) for simplicity. The results for both $\epsilon = 0.01$ and $\epsilon = 0.005$ ($k - k_0$ corresponds to $k$ here) are shown in Figure 1. Client 2 keeps getting service until the weights of both clients reach the same level, and the smaller $\epsilon$ is, the longer the "unfair serving" period lasts.

It should be mentioned that this underdraft idea can be used under any upper-bounded query arrival model, and is not restricted to the Bernoulli arrival model considered in this paper.

$^4$ Note that this temporary unfairness is not an artifact of the underdraft mechanism. In fact, it occurs once a sample path enters a state where some clients have huge differences from others in their corresponding queue lengths, which can also happen under the original algorithm. We are just using the underdraft scheme to illustrate this phenomenon.

IV. A Universal Lower Bound on the Expected Overdraft Level

We want to show that in the long run, an expected overdraft level of $\Omega(\log(1/\epsilon))$ is unavoidable under any stationary ad assignment algorithm which achieves a long-term average revenue within $O(\epsilon)$ of the offline optimum, when the queue length is only allowed to be nonnegative. An ad assignment algorithm $\varpi$ is defined as a strategy which uses matrix $M^{\varpi}(t, q) \in \mathcal{M}_q$ for ad assignment when a query for keyword $q$ arrives at each time slot $t$. During each budgeting cycle $k$, the revenue obtained from client $i$ under algorithm $\varpi$ is defined as



$$A_i^{\varpi}(k) = \sum_{t=kN}^{kN+N-1} \sum_s M^{\varpi}_{is}(t, q(t))\, c_{q(t)is}(t)\, r_{q(t)i}. \tag{14}$$



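We then define the average revenue obtained from client $i$ per budgeting cycle as $\lambda_i^{\varpi} = E[A_i^{\varpi}(k)]$ in the steady state. The long-term average revenue (per time slot) is thus $R^{\varpi} = \sum_i \lambda_i^{\varpi}/N$, and the overdraft level of client $i$ evolves as

$$Q_i^{\varpi}(k+1) = \left[Q_i^{\varpi}(k) + A_i^{\varpi}(k) - b_i(k)\right]^+. \tag{15}$$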



Note that our online algorithm is one particular $\varpi$, which makes the decision based on the current overdraft levels of all clients.

To seek a universal lower bound on the expected overdraft level in the long run (here, equivalent to steady state), we only have to consider those algorithms $\varpi$ such that $\bar{Q}_i^{\varpi} = E[Q_i^{\varpi}(k)] < \infty$ for all $i$. To categorize these "stable" algorithms, we define the "per-client revenue region," similar to the concept of "capacity region" in the context of queueing networks:

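Definition 1 ("Per-Client Revenue Region"):

$$\mathcal{C} \triangleq \left\{ \lambda^{\varpi} = \{\lambda_i^{\varpi}\} \ge 0 :\ \exists\, \varpi \text{ s.t. } \lambda_i^{\varpi} = E[A_i^{\varpi}(k)] \le b_i,\ \forall i \right\},$$

given fixed parameters $\{r_{qi}\}$, $\{b_i\}$, $\{c_{qis}\}$, $N$ and statistical properties of $q(t)$ and $\{c_{qis}(t)\}$. $\diamond$

The offline optimal average revenue is then equal to $\max_{\lambda \in \mathcal{C}} \sum_i \lambda_i / N$, which is denoted as $R^*$.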

Note that if the query arrival rates per budgeting cycle are too low, the average revenue drawn from some client will never hit its specified budget, no matter which algorithm $\varpi$ with $\lambda^{\varpi} \in \mathcal{C}$ you pick (i.e., $\exists\, i$ s.t. no feasible solution $p$ can make the budget constraint for this $i$ tight). The system resources (here, budgets) are then underutilized and it is not so important to consider the tradeoff between revenue and overdraft. To avoid this, we can assume a relatively large $N$ (i.e., the number of time slots in one budgeting cycle) such that

$$N \ge \max_i \left\{ \frac{b_i}{\sum_q \nu_q\, r_{qi} \cdot \max_{M \in \mathcal{M}_q^i} c_{qis(i,M)}} \right\}, \tag{16}$$

where $\mathcal{M}_q^i \subseteq \mathcal{M}_q$ is defined as the set of ad assignment matrices of which the $i$-th row has a "1", and $s(i,M)$ in $c_{qis(i,M)}$ refers to the column in $M$ where that "1" stays. This guarantees that for each $i$, there exists an algorithm $\varpi_i$ such that $\lambda^{\varpi_i} \in \mathcal{C}$ and $\lambda_i^{\varpi_i} = b_i$. The reason is that the denominator in (16) is the largest average revenue per time slot that can be drawn from client $i$ alone, so under (16) the corresponding average revenue per budgeting cycle is at least $b_i$. In the following text, we will assume the above condition for $N$.
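To make the condition on $N$ concrete, the following Python sketch computes the smallest cycle length permitted by a bound of the form $N \ge \max_i b_i / \sum_q \nu_q r_{qi} \max_s c_{qis}$. It assumes that the best placement of client $i$ for keyword $q$ is simply its slot with the highest click-through rate, and that every client is related to at least one keyword; the function name and data layout are hypothetical.

```python
def min_cycle_length(b, nu, r, c):
    """Smallest N satisfying the assumed form of the condition on N.

    b[i]:       average budget of client i per budgeting cycle
    nu[q]:      probability that a query for keyword q arrives in a time slot
    r[q][i]:    revenue per click of client i for keyword q
    c[q][i][s]: click-through rate of client i's ad in slot s for keyword q
    """
    bounds = []
    for i in range(len(b)):
        # Largest average revenue per slot obtainable from client i alone.
        best_rate = sum(nu[q] * r[q][i] * max(c[q][i]) for q in range(len(nu)))
        bounds.append(b[i] / best_rate)
    return max(bounds)
```

For a single client with $b = 0.6$, $\nu = 0.5$, $r = 1$ and $c = 0.5$, the bound evaluates to $0.6 / 0.25 = 2.4$ time slots.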



A. One Keyword, One Client and One Webpage Slot 

We start with the simplest model: one keyword, one client and one webpage slot (hence we omit all the subscripts in the corresponding notations). Under condition (16), the offline maximum average revenue is trivially $b/N$.

Theorem 2: Given a small $\epsilon > 0$, if an algorithm $\varpi$ leads to $E[A^{\varpi}(k)] \ge b - \epsilon$ in the steady state, then

$$\bar{Q}^{\varpi} \ge \frac{\log(1/\epsilon)}{2\left(1 - \log(\psi P^+)\right)} - 1,$$

where we assume that $\psi = \Pr(\text{no query arrival in a budgeting cycle}) > 0$ and $P^+ = \Pr(b(k) > 0) > 0$. $\diamond$

Note that this result works for any query arrival and budget spending model satisfying the above two stated assumptions, and is not restricted to the model we described in Subsection III-B. In the proof below, we generally write $b(k)$ as a random variable which can possibly take all nonnegative integer values.

Proof: We ignore the superscript $\varpi$ for brevity. The dynamics of the queue is rewritten as $Q(k+1) = Q(k) + A(k) - \hat{b}(k)$, where the actual departure process is defined as

$$\hat{b}(k) \triangleq \begin{cases} b(k) & \text{if } Q(k) + A(k) - b(k) \ge 0; \\ Q(k) + A(k) & \text{otherwise.} \end{cases} \tag{17}$$

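Let $p_i = \Pr(\hat{b}(k) = i)$ and $q_i = \Pr(b(k) = i)$ in the steady state. Note that

$$\begin{aligned} b - \epsilon \le E[A(k)] = E[\hat{b}(k)] &= \sum_{i=1}^{\infty} \Pr(\hat{b}(k) \ge i) = \Pr(\hat{b}(k) \ge 1) + \sum_{i=2}^{\infty} \Pr(\hat{b}(k) \ge i) \\ &\overset{(a)}{\le} (1 - p_0) + \sum_{i=2}^{\infty} \Pr(b(k) \ge i) = 1 - p_0 + \left(b - \Pr(b(k) \ge 1)\right) = 1 - p_0 + b - (1 - q_0) \\ &= q_0 - p_0 + b, \end{aligned}$$

where (a) holds because $\Pr(\hat{b}(k) \ge i) \le \Pr(b(k) \ge i)$ for all $i \ge 0$. Thus, $p_0 \le q_0 + \epsilon$. Since

$$\Pr(\hat{b}(k) = 0) = \Pr(b(k) = 0) + \Pr(\hat{b}(k) = 0,\ b(k) \ge 1),$$

we have $p_0 = q_0 + \tilde{p}_0$, where $\tilde{p}_0 = \Pr(\hat{b}(k) = 0,\ b(k) \ge 1)$. Therefore,

$$\tilde{p}_0 \le \epsilon. \tag{18}$$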

Next, we are looking for a lower bound on $\tilde{p}_0$ in relation to $\bar{Q}$. Letting $P^+ = \Pr(b(k) > 0)$ (which is surely a positive constant since $b > 0$), we then have

$$\begin{aligned} n \tilde{p}_0 &= \sum_{k=0}^{n-1} \Pr(\hat{b}(k) = 0,\ b(k) > 0) \overset{(a)}{\ge} \Pr\left(\bigcup_{k=0}^{n-1} \{\hat{b}(k) = 0,\ b(k) > 0\}\right) \\ &\overset{(b)}{\ge} \Pr\left(Q(0) \le n-1;\ A(k) = 0,\ b(k) > 0,\ \forall\, 0 \le k \le n-1\right) \\ &= \Pr(Q(0) \le n-1) \prod_{k=0}^{n-1} \Pr(A(k) = 0) \cdot \Pr(b(k) > 0) \\ &= (\psi P^+)^n \cdot \Pr(Q(0) \le n-1) \overset{(c)}{\ge} (\psi P^+)^n \left(1 - \bar{Q}/n\right), \end{aligned} \tag{19}$$

where (a) holds according to the union bound, (b) holds since the event on the RHS implies the one on the LHS, and (c) holds due to the Markov inequality. If we pick $n := \lceil 2\bar{Q} \rceil \in [2\bar{Q},\ 2(\bar{Q}+1)]$, inequality (19) further implies that

$$\tilde{p}_0 \ge \frac{(\psi P^+)^n}{2n} \overset{(e)}{\ge} e^{-n\left(1 - \log(\psi P^+)\right)} \ge e^{-2(\bar{Q}+1)\left(1 - \log(\psi P^+)\right)}, \tag{20}$$

where (e) holds because $\frac{1}{2x} \ge e^{-x}$ for all $x > 0$. Combining inequalities (18) and (20) then completes the proof. ■
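As a quick check of the last step, the two bounds on $\tilde{p}_0$ can be combined explicitly. The following is only a sketch of the algebra, assuming the upper bound $\tilde{p}_0 \le \epsilon$ from (18) and the exponential lower bound at the end of (20), and using $1 - \log(\psi P^+) > 0$ (which holds since $\psi P^+ \le 1$):

```latex
\begin{align*}
\epsilon \;\ge\; \tilde{p}_0 \;\ge\; e^{-2(\bar{Q}+1)\left(1-\log(\psi P^+)\right)}
 &\;\Longrightarrow\; \log(1/\epsilon) \;\le\; 2(\bar{Q}+1)\bigl(1-\log(\psi P^+)\bigr) \\
 &\;\Longrightarrow\; \bar{Q} \;\ge\; \frac{\log(1/\epsilon)}{2\bigl(1-\log(\psi P^+)\bigr)} - 1 .
\end{align*}
```

The right-hand side grows like $\log(1/\epsilon)$ as $\epsilon \to 0$, which is the claimed $\Omega(\log(1/\epsilon))$ behavior.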

In the related literature, [13] comes up with an $\Omega(1/\sqrt{\epsilon})$ bound for a set of algorithms under some admissibility conditions, while [25] provides an $\Omega(\log(1/\epsilon))$ bound for more general algorithms.

Our proof uses the following ideas inspired by [25]: if the throughput is lower bounded by a number close to the average potential departure rate, then the probability of zero actual departures given nonzero potential departures must be upper bounded by a small number; further, if the average queue length is given, then the probability of hitting zero must be upper bounded because otherwise, the queue length would become small. However, we cannot directly use the expression for the lower bound in [25] since it imposes certain strict convexity assumptions which do not apply to our model where the objective is linear. So we have provided a very simple derivation of the lower bound on the queue length for our specific model.

Additionally, our $\Omega(\log(1/\epsilon))$ bound based on a linear objective function can be extended to the multi-queue case (in Subsection IV-B). The $\Omega(1/\sqrt{\epsilon})$ bound in [13] has been extended to the multi-queue case in [24], but still under a strict convexity assumption and for a restrictive class of algorithms. Whether the $\Omega(\log(1/\epsilon))$ bound in [25] can be easily extended to multiple queues still remains a question.



B. Multiple Keywords, Multiple Clients and Multiple Webpage Slots 

We now extend this lower bound to the original general model, which can have multiple keywords, multiple clients and multiple webpage slots. It is easy to see that the "per-client revenue region" $\mathcal{C}$ in Definition 1 is a polytope, which can then be rewritten as

$$\mathcal{C} = \left\{ \lambda \ge 0 :\ \sum_i h_i^{(n)} \lambda_i \le d^{(n)},\ \forall\, 1 \le n \le L \right\}, \tag{21}$$

where $h_i^{(n)} \ge 0$ and $d^{(n)} > 0$ for all $i$ and $n$. The outer boundary of the polytope $\mathcal{C}$ consists of the $L$ hyperplanes, i.e., $\sum_i h_i^{(n)} \lambda_i = d^{(n)}$ for all $n \in [1, L]$.

Under condition (16), $L$ is at least equal to the number of clients (i.e., number of budget constraints), so (21) gives a more precise description of the stability condition for this "multi-queue system," compared to the original definition of $\mathcal{C}$. Thus, corresponding to the normal vector of each hyperplane, we convert the original multi-queue system into a new one with $L$ queues: For each $n \in [1, L]$, we first scale the $i$-th queue described in (15) by $h_i^{(n)}$, so that it has a queue length equal to $h_i^{(n)} Q_i(k)$, with $h_i^{(n)} A_i(k)$ arrivals and $h_i^{(n)} b_i(k)$ potential departures in time slot $k$, for all $i$. Next, we treat $\sum_i h_i^{(n)} Q_i(k)$ as the $n$-th queue, and since any $\lambda \in \mathcal{C}$ satisfies $\sum_i h_i^{(n)} \lambda_i \le d^{(n)}$, its maximum achievable average departure rate equals $d^{(n)}$, where $d^{(n)} \le \sum_i h_i^{(n)} b_i$, because the potential departure rate of each individual scaled queue may not be fully achieved when all of them are coupled together.

We then come up with the formal definition of the class of algorithms which achieves a "near-optimal" average revenue.

Definition 2 ("$\epsilon$-Neighbourhood" of the maximum): Let $\lambda^*$ be one optimal point in $\mathcal{C}$ such that $\sum_i \lambda_i^*/N = R^*$. The $\epsilon$-neighbourhood of $\lambda^*$ is defined as

$$\mathcal{N}_\epsilon = \left\{ \lambda^{\varpi} \in \mathcal{C} \setminus \partial\mathcal{C} :\ 0 < N \cdot (R^* - R^{\varpi}) \le \epsilon \right\}, \tag{22}$$

where $\partial\mathcal{C}$ represents the outer boundary of $\mathcal{C}$, and it should be noted that the average revenue is evaluated per time slot while $\lambda$ is evaluated per $N$ time slots. $\diamond$

Note that in the above definition, since $\lambda^{\varpi} \in \mathcal{N}_\epsilon$ is not on any boundary, $R^*$ is strictly larger than $R^{\varpi}$, which is easy to see from some basic principles of linear programming.

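The following theorem shows the universal lower bound $\Omega(\log(1/\epsilon))$ for the general case.

Theorem 3: For any algorithm $\varpi$ s.t. $\lambda^{\varpi} \in \mathcal{N}_\epsilon$, we have

$$\sum_i \bar{Q}_i^{\varpi} \ge \frac{\log(1/\epsilon) - C_2}{C_1},$$

where $\psi = \Pr(\text{no query arrival in a budgeting cycle}) = (1-\nu)^N > 0$, $P^+ = \Pr(b_i(k) > 0,\ \forall i) > 0$, and

$$C_1 = 2\left(1 - \log(\psi P^+)\right) \cdot \max_{i,n} h_i^{(n)} \in (0, \infty), \qquad C_2 = \max\left\{\log\left(\max_{i,n} h_i^{(n)}\right),\ 0\right\} \in [0, \infty). \tag{23}$$

$\diamond$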

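Proof: We ignore the superscript $\varpi$ for brevity. According to some basic principles of linear programming, an optimal point $\lambda^*$ is at a corner of $\mathcal{C}$. If there are several optimal points, any convex combination of them is also optimal. Denote this set of optimal points as $\Lambda^*$; then $\forall \lambda^* \in \Lambda^*$, $\exists\, n^* \in [1, L]$ s.t. $\sum_i h_i^{(n^*)} \lambda_i^* = d^{(n^*)}$.

Given a $\lambda \in \mathcal{N}_\epsilon$, $\exists\, \theta$ s.t. $\sum_i \theta_i = \sum_i \lambda_i^*$ and $\theta_i \ge \lambda_i$ for all $i$ (but at least one inequality is strict). Besides, for this $\theta$, $\exists\, \hat{n} \in [1, L]$ s.t. $\sum_i h_i^{(\hat{n})} \theta_i \ge d^{(\hat{n})}$ (otherwise, $\theta \in \mathcal{C} \setminus \partial\mathcal{C}$ will hold and hence $\sum_i \theta_i < \sum_i \lambda_i^*$, which leads to a contradiction). Therefore,

$$d^{(\hat{n})} - \sum_i h_i^{(\hat{n})} \lambda_i \le \sum_i h_i^{(\hat{n})} (\theta_i - \lambda_i) \overset{(a)}{\le} h_{\max}^{(\hat{n})} \sum_i (\theta_i - \lambda_i) = h_{\max}^{(\hat{n})} \sum_i (\lambda_i^* - \lambda_i) \le h_{\max}^{(\hat{n})}\, \epsilon, \tag{24}$$

where $h_{\max}^{(\hat{n})} = \max_i h_i^{(\hat{n})} > 0$ and inequality (a) holds because $\theta_i \ge \lambda_i$ for all $i$. Letting $P'^{+} = \Pr\left(\sum_i h_i^{(\hat{n})} b_i(k) > 0\right)$, it is easy to see that $P'^{+} \ge \Pr(b_i(k) > 0,\ \forall i) = P^+ > 0$. Together with Theorem 2, we can conclude that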



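$$\sum_i h_i^{(\hat{n})} \bar{Q}_i \ge \frac{\log(1/\epsilon) - \log\left(h_{\max}^{(\hat{n})}\right)}{2\left(1 - \log(\psi P'^{+})\right)} \ge \frac{\log(1/\epsilon) - \log\left(h_{\max}^{(\hat{n})}\right)}{2\left(1 - \log(\psi P^{+})\right)}.$$

Since $\sum_i h_i^{(\hat{n})} \bar{Q}_i \le h_{\max}^{(\hat{n})} \sum_i \bar{Q}_i$, it is further concluded that

$$\sum_i \bar{Q}_i \ge \frac{\log(1/\epsilon) - \log\left(h_{\max}^{(\hat{n})}\right)}{2 h_{\max}^{(\hat{n})} \left(1 - \log(\psi P^{+})\right)} \ge \frac{\log(1/\epsilon) - C_2}{C_1},$$

where the universal constants are defined in (23), and it is guaranteed that $C_1 \in (0, \infty)$ and $C_2 \in [0, \infty)$. This completes the proof. ■

Fig. 2: An illustration of the idea in the proof of Theorem 3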

Remark 3: We briefly explain the idea behind choosing $\theta$ in the above proof: For those $\lambda \in \mathcal{N}_\epsilon$ such that $\lambda_i \le \lambda_i^*$ for all $i$ (at least one is strict), $\theta$ can be directly chosen as $\lambda^*$ to make inequality (a) in (24) hold. But for the other $\lambda \in \mathcal{N}_\epsilon$ which do not satisfy the above condition, it is necessary to introduce a $\theta$ other than $\lambda^*$, which both lies on the "maximum revenue line" (i.e., $\sum_i \theta_i = \sum_i \lambda_i^*$) and dominates $\lambda$ component-wise, in order to derive inequality (24). Note that $\theta$ is not unique and, furthermore, lies either on $\partial\mathcal{C}$ or in the exterior of $\mathcal{C}$, and it can be chosen as a boundary point only if the optimal revenue point is not unique. Figure 2 illustrates this idea using an example with one keyword, two clients and one webpage slot, specifically for showing where such a $\theta$ is located.

The basic idea in our proof is to use Theorem 2 to first get a lower bound for those new single queues written as a "weighted sum" of the original queues (described above). This idea is similar to one part in the proof for the lower bound on the expected queue length of a departure-controlled multi-queue system in [26], but some techniques in their proof cannot directly apply to arrival-controlled queues like ours.

C. Tightness of the Lower Bound 

We want to show that the $\Omega(\log(1/\epsilon))$ universal lower bound is tight, i.e., achievable by some algorithms. Consider the following simple queueing model: the arrival process $a(k)$ is i.i.d. across time, $a(k) = 2$ w.p. $\nu$ and $a(k) = 0$ otherwise. The service rate is constant and equal to 1. Assume that $\nu \in (1/2, 1)$. With the controlled arrival process $\hat{a}(k)$, we want to achieve a throughput $E[\hat{a}(k)] \ge 1 - \epsilon$ for a given small $\epsilon > 0$. A "threshold policy" based on a threshold $T$ is proposed below:

• When $Q(k) > T$, reject all arrivals;

• When $Q(k) = T$, accept one arrival w.p. $p_1$, accept two arrivals w.p. $p_2$, and reject all of them otherwise;

• When $Q(k) < T$, accept all arrivals.
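The threshold policy can be explored numerically with the Python sketch below. It assumes that one unit is served in every slot after admission (so the queue moves as $Q \leftarrow \max(Q + \text{accepted} - 1, 0)$), that the random choice at $Q = T$ is made only when two arrivals are present, and that $T \ge 1$ is an integer. Under these assumptions the chain is a birth-death chain on $\{0, \dots, T+1\}$, whose stationary throughput is computed in closed form and inverted back to the threshold. All function names are hypothetical.

```python
import math
import random


def stationary_throughput(nu, T, p1, p2):
    """Average number of accepted arrivals per slot in steady state."""
    pi = [1.0]                                        # unnormalised pi_0
    for _ in range(T - 1):                            # pi_1 .. pi_{T-1}
        pi.append(pi[-1] * nu / (1.0 - nu))
    pi.append(pi[-1] * nu / (1.0 - nu * (p1 + p2)))   # pi_T
    pi.append(pi[-1] * nu * p2)                       # pi_{T+1}
    total = sum(pi)
    pi = [x / total for x in pi]
    return nu * (2.0 * sum(pi[:T]) + pi[T] * (2.0 * p2 + p1))


def threshold_from_eps(eps, nu, p1, p2):
    """Threshold T for which the throughput equals 1 - eps."""
    C = ((2.0 * nu - 1.0 + eps) * (1.0 - nu * (p1 + p2))
         / (nu * (2.0 - 2.0 * (1.0 - nu) * p2 - p1)))
    return (math.log(1.0 / eps) + math.log(C)) / math.log(nu / (1.0 - nu))


def simulate_threshold_policy(nu, T, p1, p2, steps, seed=0):
    """Monte Carlo run; returns (throughput, mean queue, max queue)."""
    rng = random.Random(seed)
    Q, accepted_total, queue_sum, max_Q = 0, 0, 0, 0
    for _ in range(steps):
        accepted = 0
        if rng.random() < nu:                 # two arrivals in this slot
            if Q < T:
                accepted = 2
            elif Q == T:
                u = rng.random()
                accepted = 1 if u < p1 else (2 if u < p1 + p2 else 0)
            # Q > T: reject both arrivals
        Q = max(Q + accepted - 1, 0)          # serve one unit per slot
        accepted_total += accepted
        queue_sum += Q
        max_Q = max(max_Q, Q)
    return accepted_total / steps, queue_sum / steps, max_Q
```

Starting from a given integer threshold, computing the throughput gap $\epsilon$ and feeding it into `threshold_from_eps` recovers the same threshold, which shows $T$ growing like $\log(1/\epsilon)$.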

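Defining $\pi_i$ as the steady-state probability that $Q(k) = i$ ($0 \le i \le T+1$) for the resulting Markov chain, the local balance equations are given below:

$$\begin{aligned} \pi_i\, \nu &= \pi_{i+1}\, (1 - \nu), \qquad 0 \le i \le T-2, \\ \pi_{T-1}\, \nu &= \pi_T \left(1 - \nu (p_1 + p_2)\right), \\ \pi_T\, \nu\, p_2 &= \pi_{T+1}, \\ \sum_{i=0}^{T+1} \pi_i &= 1. \end{aligned} \tag{25}$$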



Combining these equations with the throughput requirement, we get

$$2\sum_{i=0}^{T-1} \pi_i + \pi_T\, (2p_2 + p_1) = \frac{1 - \epsilon}{\nu}, \tag{26}$$

and one can finally show that (ignoring detailed calculations)

$$T = \frac{\log(1/\epsilon) + \log C(\nu)}{\log\left(\frac{\nu}{1-\nu}\right)},$$

where

$$C(\nu) \triangleq \frac{(2\nu - 1 + \epsilon)\left(1 - \nu(p_1 + p_2)\right)}{\nu\left(2 - 2(1-\nu)p_2 - p_1\right)}.$$

The above result further implies that $\bar{Q} \sim O(\log(1/\epsilon))$. We can also see that as $\nu \to 1$, $T \to 0$, which is consistent with the fact that the lower bound given in Theorem 2 goes to 0 as the "zero arrival probability" $\psi \to 0$.

Another example showing the tightness of an $\Omega(\log(1/\epsilon))$ bound is the dynamic packet dropping algorithm in [25] (note that this universal lower bound is proved based on a strict convexity assumption as mentioned before in Subsection IV-A).



V. Click-Through Rate Maximization Problem 

In this section, we consider another online ads model, in which the objective is to maximize the long-term average total click-through rate of all queries. Instead of an average budget, client $i$ specifies in the contract an average "impression requirement" $m_i$, which is the minimum number of times an ad of this client should be posted by the service provider per "requirement cycle" (equal to $N$ time slots) on average. The other parameters are the same as in the model proposed for the revenue maximization problem.

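The corresponding optimization formulation now becomes

$$\max_{p \in \mathcal{F}} J(p) = \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is}\, c_{qis}, \tag{27}$$

where the feasible set $\mathcal{F}$ is characterized by

$$N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} \ge m_i, \quad \forall i; \tag{28}$$

$$0 \le p_{qM} \le 1, \quad \forall q,\ M \in \mathcal{M}_q; \tag{29}$$

$$\sum_{M \in \mathcal{M}_q} p_{qM} \le 1, \quad \forall q. \tag{30}$$

Different from the revenue maximization problem, here the feasible set can become empty if some $m_i$ is too high. Basically, without constraint (28), $\mathcal{F}$ is relaxed to

$$\mathcal{F}_0 = \left\{ p :\ 0 \le p_{qM} \le 1,\ \forall q,\ M \in \mathcal{M}_q;\ \sum_{M \in \mathcal{M}_q} p_{qM} \le 1,\ \forall q \right\}. \tag{31}$$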

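We can then define the following capacity region, which characterizes how large an average number of impressions can be achieved for each client per requirement cycle:

$$\mathcal{C} = \left\{ m \ge 0 :\ \exists\, p \in \mathcal{F}_0 \text{ s.t. } N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} \ge m_i,\ \forall i \right\}.$$

Clearly, $m \in \mathcal{C}$ must hold to ensure the existence of a solution for the above optimization problem.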

Through a similar approach as in Subsection III-A, we can write down a similar online algorithm based on the same stochastic model as defined in Subsection III-B. We define $\mathbf{q}(k) = \{q(t),\ \text{for } kN \le t \le kN+N-1\}$. Similar to $b_i(k)$, $\tilde{m}_i(k) = \lceil m_i \rceil$ w.p. $m_i - \lfloor m_i \rfloor$ and $\tilde{m}_i(k) = \lfloor m_i \rfloor$ otherwise.



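Online Algorithm: (in each requirement cycle $k \ge 0$)

In each time slot $t \in [kN, kN+N-1]$, if $q(t) > 0$, choose the assignment matrix

$$M^*(t, q(t), Q(k)) \in \arg\max_{M \in \mathcal{M}_{q(t)}} \sum_{i,s} M_{is} \left(\frac{c_{q(t)is}}{\epsilon} + Q_i(k)\right). \tag{32}$$

At the end of requirement cycle $k$, for each client $i$, update

$$Q_i(k+1) = \left[Q_i(k) + \tilde{m}_i(k) - S_i(k, Q(k), \mathbf{q}(k))\right]^+,$$

where

$$S_i(k, Q(k), \mathbf{q}(k)) = \sum_{t=kN}^{kN+N-1} \sum_s \left[M^*(t, q(t), Q(k))\right]_{is}. \tag{33}$$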
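The bookkeeping of this online algorithm can be sketched in Python as follows. The sketch covers only the per-client pieces: the randomized integer requirement, the matching weight of placing a client's ad in a slot, and the end-of-cycle queue update. It assumes the weight takes the form $c/\epsilon + Q_i$ and that the queue is truncated at zero; the function names are hypothetical.

```python
import math
import random


def random_requirement(m, rng):
    """Return ceil(m) w.p. m - floor(m) and floor(m) otherwise (mean m)."""
    low = math.floor(m)
    return low + 1 if rng.random() < m - low else low


def matching_weight(c, Q, eps):
    """Weight of showing a client's ad: click-through rate / eps + queue."""
    return c / eps + Q


def update_credit_queue(Q, m_k, served):
    """End-of-cycle update [Q + m_k - served]^+ for one client."""
    return max(Q + m_k - served, 0)
```

A client that received fewer impressions than required in a cycle carries the shortfall forward, which raises its weight in later matchings.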



In real online advertising business, some clients may only have short-term contracts, i.e., clients may not be interested in the average number of impressions per time slot but may be interested in a minimum number of impressions in a given duration (such as a day). Further, query arrivals may not form a stationary process. In fact, they are more likely to vary depending on the time of day. These extensions are considered in Appendix E. Such extensions also make sense for the revenue maximization model considered in the previous sections, but the approach is similar to Appendix E and so will not be considered here.

A. Performance Evaluation

$S_i(k, Q(k), \mathbf{q}(k))$ defined in (33) represents the actual number of impressions for client $i$'s ads during requirement cycle $k$. The queue length increases when the average impression requirements in a particular requirement cycle cannot be fulfilled. Hence, a positive queue represents accumulated "credits," which enhances the chance of being assigned a webpage slot in the future, much like a negative queue in the revenue maximization problem. We thus call this queue a "credit queue."

Unlike the revenue maximization problem in which an $O(1/\epsilon)$ upper bound on the transient queue length is automatically imposed by the online algorithm, here we need to prove the stability of the queues and show an upper bound on the mean queue length. Since $\{Q(k)\}$ defines an irreducible and aperiodic Markov chain, in order to prove its stability (positive recurrence), we will first bound the expected drift of $Q(k)$ for a suitable Lyapunov function.

Lemma 2: Consider the Lyapunov function $V(Q) = \frac{1}{2}\sum_i Q_i^2$. For any $\epsilon > 0$ and each requirement cycle $k$,

$$E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \le \frac{D_3}{\epsilon} + D_1 - D_2 \sum_i Q_i. \tag{34}$$

Here,

D x 4 I(V(iV-l)L 2 + Aa + ^Kl 2 K- KJ)+ [m l \ 2 (l-m i + [m*])), (35) 

i 

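where $L$ is the number of webpage slots;

$$D_2 \triangleq \min_i \left\{ N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} - m_i \right\} \tag{36}$$

for some $p \in \mathcal{F}$ such that $D_2 > 0$; and

$$D_3 = N \cdot \max_{p \in \mathcal{F}_0} J(p), \tag{37}$$

where $\mathcal{F}_0$ is defined in (31). $\diamond$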

The proof is similar to the proof of Lemma 1 with some modifications in the final steps, which are given briefly in the appendix. With this lemma, we can conclude that $Q(k)$ is positive recurrent because the expected Lyapunov drift is negative except for a finite set of values of $Q(k)$, according to the Foster-Lyapunov theorem (see, e.g., [23]).

Remark 4: Note that compared to the definition of $B_2$ in (11) of Lemma 1, where $B_2 \ge 0$, $D_2$ needs to be strictly positive in order to prove the stability of queues. Such a $p$ in the definition of $D_2$ can always be found unless $\mathcal{F}$ is a degenerate set with at most one element. $\diamond$

The stability of the queues directly implies the following corollary:

Corollary 2 (Over services in the long term):

$$\lim_{K\to\infty} E\left[\frac{1}{K}\sum_{k=1}^{K} S_i(k, Q(k), \mathbf{q}(k))\right] \ge m_i, \quad \forall i.$$

$\diamond$

In addition to proving stability, Lemma 2 will be used to evaluate the upper bound on the expected total queue length in the steady state, as shown in the following theorem:

Theorem 4: Under the online algorithm,

$$E\left[\sum_i Q_i(\infty)\right] \le \frac{1}{D_2^*}\left(D_1 + \frac{D_3}{\epsilon}\right), \tag{38}$$

where $D_1$ and $D_3$ are respectively defined in (35) and (37), and $D_2^*$ is defined as

$$D_2^* = \max_{p \in \mathcal{F}_0} D_2(p), \tag{39}$$

where $D_2$ is defined in (36) (regarded as a function of $p$). $\diamond$



Proof: Averaging both sides of inequality (34) over $0 \le k \le K-1$, taking $K \to \infty$ and doing some simple algebra, one obtains

$$\lim_{K\to\infty} \frac{1}{K}\sum_{k=0}^{K-1} E\left[\sum_i Q_i(k)\right] \le \frac{1}{D_2}\left(D_1 + \frac{D_3}{\epsilon}\right).$$

The LHS equals $E\left[\sum_i Q_i(\infty)\right]$ according to Theorem 15.0.1 in [23]. The RHS is minimized through maximizing $D_2$ over all $p \in \mathcal{F}_0$ (which will certainly satisfy $p \in \mathcal{F}$ and $D_2 > 0$). This completes our proof. ■



The following theorem shows that the online algorithm proposed above achieves a long-term average click-through rate within $O(\epsilon)$ of the offline optimum. The proof is similar to the one for Theorem 1 and hence will be omitted.

Theorem 5: For any $\epsilon > 0$,

$$0 \le J(p^*) - \lim_{K\to\infty} E\left[\frac{1}{KN}\sum_{k=0}^{K-1} J(k)\right] \le \frac{D_1 \epsilon}{N},$$

for some constant $D_1 > 0$ (defined in (35) in Lemma 2). Here, $p^*$ is the offline optimal solution of (27) and $J(k)$ is defined as the total number of click-through events within requirement cycle $k$. $\diamond$



B. Customizing Impression Requirements $\{m_i\}$ Based on Query Arrival Rates $\{\nu_q\}$

Since a positive queue measures how much the service provider "owes" a client, reducing the coefficient of the $1/\epsilon$ term in the upper bound on the mean queue length becomes important. Besides, we also need to guarantee $m \in \mathcal{C}$. In order to handle these two issues, we introduce an approach to customizing $\{m_i\}$ based on known (or estimated) query arrival rates $\{\nu_q\}$.

Replacing $D_2^*$ in Theorem 4 by a common $D_2$ defined in equation (36), if we want the expected total queue length to be upper bounded by $Q_{\max}$, it suffices to let

$$D_2 \ge \zeta \triangleq \frac{1}{Q_{\max}}\left(D_1 + \frac{D_3}{\epsilon}\right), \tag{40}$$

where $D_3$ is already determined, and $D_1$ does not matter much given a small $\epsilon$ although it includes the unknown $\{m_i\}$. We then solve the following optimization problem to determine $\{m_i\}$:

$$\max \sum_i \log m_i \quad \text{s.t.} \quad N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} - m_i \ge \zeta, \quad \forall i.$$

Here we use $\sum_i \log m_i$ as the objective function in order to guarantee a unique optimal solution and impose a certain fairness rule called "proportional fairness" (see e.g. [11]). Note that $\zeta$ cannot be set too large (i.e., $Q_{\max}$ cannot be set too small), otherwise there may not exist a feasible solution.

Naturally, a question would arise: now that we need to solve a mathematical program like the above one based on knowledge of query arrival rates, why not also directly solve the original linear program in (27) and use the offline optimal solution $p^*$ to assign ads? The answer to this is similar to the case of the max-weight algorithm for wireless networks. In [27] and [29], it has been shown that adaptive algorithms lead to much better queueing performance compared to static offline algorithms. We verify this assertion in our context through simulations in the next subsection.



C. Queue Update in a Faster Time Scale 

In the original algorithm, the queue length is updated only at the end of each requirement cycle and used in the max-weight matching for the next whole requirement cycle. The longer a requirement cycle lasts, the more obsolete the queue length information becomes, so with a large $N$, short-term performance may not be so good even if long-term performance is still guaranteed.

We then propose a solution which updates queue lengths in a faster time scale. Specifically, we divide each requirement cycle into $T$ queueing cycles with equal lengths (assuming $N/T \in \mathbb{Z}^+$ without loss of generality). We use $\{Q(k, \tau) : 0 \le \tau \le T\}_{k \ge 0}$ to denote this new queueing system and assume $Q(-1, T) = 0$. At the beginning of each requirement cycle $k$ before any decision, update

$$Q(k, 0) = Q(k-1, T) + \tilde{m}(k),$$

and at the end of the $\tau$-th queueing cycle within this requirement cycle ($1 \le \tau \le T$), for each client $i$,

$$Q_i(k, \tau) = \left[ Q_i(k, \tau-1) - \sum_{t = kN + (\tau-1)\frac{N}{T}}^{kN + \tau\frac{N}{T} - 1} \sum_s \left[M^*(t, q(t), Q(k, \tau-1))\right]_{is} \right]^+.$$

Since $\|Q(k, T) - Q(k)\| \le B$ for some constant $B$ independent of the queue lengths, it can be shown that the long-term performance evaluated in Subsection V-A is still guaranteed (the proof is omitted).
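The faster queue update can be sketched in Python for a single client as follows. The sketch takes the impressions actually delivered in each time slot as given (in the real algorithm they depend on the queue through the matching step), and it assumes that the queue is truncated at zero after each queueing cycle. The function name and argument layout are hypothetical.

```python
def fast_queue_update(Q_prev, m_k, impressions, T):
    """Update one client's credit queue T times within a requirement cycle.

    Q_prev:      queue length at the end of the previous requirement cycle
    m_k:         impression requirement added at the start of this cycle
    impressions: per-slot impression counts for this client, length N
    T:           number of queueing cycles per requirement cycle (N % T == 0)
    Returns the queue length after each of the T queueing cycles.
    """
    N = len(impressions)
    if N % T != 0:
        raise ValueError("N must be a multiple of T")
    step = N // T
    Q = Q_prev + m_k                      # requirement is added up front
    trace = []
    for tau in range(T):
        served = sum(impressions[tau * step:(tau + 1) * step])
        Q = max(Q - served, 0)            # credit cannot become negative
        trace.append(Q)
    return trace
```

With `T = 1` this reduces to a single end-of-cycle update, while larger `T` exposes the intermediate queue values that the matching step can react to.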

Next, we use simulations to compare three different algorithms, namely a randomized algorithm following the offline optimal solution (labeled as OPT) and two versions of our online algorithm "max-weight matching" with and without "fast queue update" respectively (labeled as MWM-Fast and MWM respectively). In each scenario we test, all the parameters are randomly generated. The impression requirements $\{m_i\}$ are chosen through the approach in Subsection V-B.

Fig. 3: Average overall over-service and under-service (normalized by the total impression requirement) impacted by the "fast queue update": (a) Over-Service, (b) Under-Service

Fig. 4: The standard deviation of overall over-service and under-service (normalized by the total impression requirement) impacted by the "fast queue update"

We take an example scenario with 2 webpage slots, 5 keywords and 10 clients. The probability that a query arrives in a time slot equals 0.7. Specifically, for the five keywords, the query arrival rates are $\nu = [0.2364, 0.0594, 0.1669, 0.0714, 0.1659]$. Table I shows the click-through rates for the ten clients ($C_1 \sim C_{10}$) corresponding to each keyword ($q_1 \sim q_5$), on webpage slots 1 and 2 respectively (a zero click-through rate indicates that the corresponding client is not related to this keyword). We use $N = 1440$ (say, one time slot is one minute and one requirement cycle is one day), $\epsilon = 10^{-4}$ and $Q_{\max} = 20/\epsilon$ (recall that $Q_{\max}$ is used to set up an upper bound on the mean queue length by the heuristic in Subsection V-B). The simulation has been run for 1000 requirement cycles.

TABLE I: Click-through rates for all the clients' ads

To compare the performance of all three algorithms, instead of considering the long-term performance requirements that we have used in the theory, we introduce two new metrics: over-service $S_i^+(k) = [S_i(k) - \tilde{m}_i(k)]^+$ and under-service $S_i^-(k) = [\tilde{m}_i(k) - S_i(k)]^+$ to client $i$ during requirement cycle $k$. Note that these metrics measure deviations from the guarantees over short time scales and so are more stringent requirements than the long-term guarantees used in the theory.
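The two metrics are simple to compute from per-cycle data. The following Python sketch does so for one client, taking lists of delivered impressions and (randomized) requirements per requirement cycle; the function name is hypothetical.

```python
def service_deviation(served, required):
    """Per-cycle over-service and under-service for one client.

    served[k]:   impressions delivered in requirement cycle k
    required[k]: impression requirement in requirement cycle k
    Returns (over, under), two lists of non-negative numbers.
    """
    over = [max(s - m, 0) for s, m in zip(served, required)]
    under = [max(m - s, 0) for s, m in zip(served, required)]
    return over, under
```

In each cycle at most one of the two metrics is positive, and their difference equals the signed deviation from the requirement.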

We show respectively in Figures 3(a) and 3(b) that the average overall over-service and under-service normalized by the total impression requirement, i.e., $E[\sum_i S_i^+(k)] / \sum_i m_i$ and $E[\sum_i S_i^-(k)] / \sum_i m_i$, are both reduced by the fast queue update. Similarly, a "variance reduction" effect is shown by the fast queue update based on the statistics $\sqrt{\mathrm{var}[\sum_i S_i^+(k)]} / \sum_i m_i$ and $\sqrt{\mathrm{var}[\sum_i S_i^-(k)]} / \sum_i m_i$, respectively in Figures 4(a) and 4(b). In terms of the overall click-through rate, our simulation has verified that the three algorithms achieve approximately the same performance (the figure is omitted here) and further demonstrated in Figure 5 that the fast queue update can also reduce its variance. Note that these performance measures also improve for each individual client, and we simply omit the figures here.

As observed from Figure 6, the offline optimal solution leads to very unstable queue dynamics. This essentially arises from the fact that the algorithm operates on an optimal point $p^*$ for which some inequalities in constraint (28) may be tight. In contrast, our online algorithm guarantees the stability of queues, and the faster the queues update, the more stable the queue dynamics become (as an example we use $T = 24$, i.e., the number of time slots per queueing cycle equals 60). This is consistent with the above results which show a reduction of over-service and under-service in both mean and variance, since these metrics directly measure the level of deviations around the equilibrium point of each stable queue.



Remark 5: While a long-term client may only be concerned with average performance, a
short-term client cares about both the mean (the average level for all the clients of its type) and
the variance (related to its own individual level), especially for under-service
and click-through rate.$^5$ All of these are well handled by our online algorithm with fast queue
updates.

$^5$ Over-service is a concern of the online ad service provider.

Fig. 5: The "standard variance to mean ratio" of overall click-through rate impacted by the "fast queue update"

Fig. 6: Queue dynamics under three algorithms

VI. Conclusions

In this paper, we propose a stochastic model to describe how search service providers charge
client companies based on users' queries for the keywords related to these companies' ads
by using certain advertisement assignment strategies. We formulate an optimization problem
to maximize the long-term average revenue for the service provider under each client's
long-term average budget constraint, and design an online algorithm which captures the stochastic
properties of users' queries and click-through behaviors. We solve the optimization problem by
making connections to scheduling problems in wireless networks, queueing theory and stochastic
networks. Our online algorithm is entirely oblivious to query arrivals and fully adaptive, so even
non-stationary query arrival patterns and short-term clients can be handled.

With a small customizable parameter $\epsilon$, which is the step size used in each iteration of the
online algorithm, we have shown that our online algorithm achieves a long-term average revenue
which is within $O(\epsilon)$ of the optimal revenue and that the overdraft level of this algorithm is upper
bounded by $O(1/\epsilon)$. By allowing negative values for the length of overdraft queues, we can
eliminate overdraft.

When estimated click-through rates instead of true ones are used in our online algorithm, we
show that the achievable fraction of the offline optimal revenue is lower bounded by $\frac{1-\Delta}{1+\Delta}$, where
$\Delta$ is the relative error in click-through rate estimation.

We also show that in the long run, an expected overdraft level of $\Omega(\log(1/\epsilon))$ is unavoidable
(a universal lower bound) under any stationary ad assignment algorithm which achieves a
long-term average revenue within $O(\epsilon)$ of the offline optimum. The tightness of this universal lower
bound is also shown for a simple queueing model using a threshold policy.

In another optimization formulation where the objective is to maximize the long-term average
click-through rate and the constraints include a minimum impression requirement for each client,
we further propose an approach to set impression requirements which make the contract feasible
and limit the average accumulated under-service to clients. Simulations show that updating the
queues on a faster time scale reduces both over-service and under-service, which benefits a
system involving short-term clients.
Appendix

A. Proof of Lemma 1



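$$
\begin{aligned}
& E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \\
&= \frac{1}{2} E\Big[\sum_i \big([Q_i + \lambda_i(k, Q, u(k)) - \tilde{b}_i(k)]^+\big)^2 \,\Big|\, Q\Big] - \frac{1}{2}\sum_i Q_i^2 \\
&\le \frac{1}{2} E\Big[\sum_i \big(Q_i + \lambda_i(k, Q, u(k)) - \tilde{b}_i(k)\big)^2 - \sum_i Q_i^2 \,\Big|\, Q\Big] \\
&= E\Big[\sum_i Q_i \big(\lambda_i(k, Q, u(k)) - \tilde{b}_i(k)\big) + \frac{1}{2}\sum_i \big(\lambda_i(k, Q, u(k)) - \tilde{b}_i(k)\big)^2 \,\Big|\, Q\Big] \\
&\le \sum_i Q_i \big(\bar{\lambda}_i(k, Q) - b_i\big) + \frac{1}{2}\sum_i \big(E[\lambda_i^2(k, Q, u(k))] + E[\tilde{b}_i^2(k)]\big), \qquad (41)
\end{aligned}
$$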

where it was already defined in equation (8) that for all $i$,

$$\lambda_i(k, Q(k), u(k)) = \sum_{t=kN}^{kN+N-1} \sum_s [M^*(q, t, Q(k))]_{is} \cdot c_{qis}(t) \cdot r_{qi},$$

and we further define

$$\bar{\lambda}_i(k, Q(k)) = E[\lambda_i(k, Q(k), u(k)) \mid Q(k)] = N \sum_q \nu_q \sum_s [M^*(q, t, Q(k))]_{is}\, c_{qis}\, r_{qi}.$$

Since each client can get at most one webpage slot for each query, we can further bound

$$\sum_i E[\lambda_i^2(k, Q, u(k))] \le \big(N(N-1)L^2 + NL\big)\Big(\max_{q,i,s}\{c_{qis} r_{qi}\}\Big)^2.$$



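Besides,

$$E[\tilde{b}_i^2(k)] = \lceil b_i \rceil^2 \varrho_i + \lfloor b_i \rfloor^2 (1 - \varrho_i) = \lceil b_i \rceil^2 (b_i - \lfloor b_i \rfloor) + \lfloor b_i \rfloor^2 (1 - b_i + \lfloor b_i \rfloor).$$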
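The second-moment expression above corresponds to a budget that is rounded up to $\lceil b_i \rceil$ with probability $b_i - \lfloor b_i \rfloor$ and rounded down to $\lfloor b_i \rfloor$ otherwise. A minimal sketch that evaluates the first two moments of such a randomized rounding is given below; the function name `rounded_budget_moments` is hypothetical and used only for illustration.

```python
import math


def rounded_budget_moments(b):
    """Mean and second moment of a budget b rounded randomly to an integer.

    The budget is rounded up with probability rho = b - floor(b) and
    rounded down with probability 1 - rho, so that its mean stays equal to b.
    """
    lo, hi = math.floor(b), math.ceil(b)
    rho = b - lo  # probability of rounding up
    first = hi * rho + lo * (1 - rho)
    second = hi ** 2 * rho + lo ** 2 * (1 - rho)
    return first, second


first, second = rounded_budget_moments(2.3)
print(first, second)  # approximately 2.3 and 5.5
```

For $b_i = 2.3$ the rounding preserves the mean, while the second moment $9 \cdot 0.3 + 4 \cdot 0.7 = 5.5$ exceeds $b_i^2 = 5.29$; this extra term is what enters the drift constant.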
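Thus, by defining

$$B_1 \triangleq \frac{1}{2}\Big(\big(N(N-1)L^2 + NL\big)\Big(\max_{q,i,s}\{c_{qis} r_{qi}\}\Big)^2 + \sum_i \big(\lceil b_i \rceil^2 (b_i - \lfloor b_i \rfloor) + \lfloor b_i \rfloor^2 (1 - b_i + \lfloor b_i \rfloor)\big)\Big)$$

and continuing from inequality (41), we have

$$
\begin{aligned}
& E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \\
&\le N \sum_q \nu_q \sum_{i,s} Q_i [M^*(q, t, Q)]_{is}\, c_{qis} r_{qi} + B_1 - \sum_i Q_i b_i \\
&= -N \sum_q \nu_q \sum_{i,s} [M^*(q, t, Q)]_{is}\, c_{qis} r_{qi} \Big(\frac{1}{\epsilon} - Q_i\Big) + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \qquad (42) \\
&\overset{(a)}{\le} -N \sum_q \nu_q \sum_{i,s} \Big(\frac{1}{\epsilon} - Q_i\Big) \sum_{M \in \mathcal{M}_q} p^*_{qM} M_{is}\, c_{qis} r_{qi} + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \\
&= -\frac{N}{\epsilon}\big(R(p^*) - R(p^*(k, Q))\big) + B_1 - \sum_i Q_i \cdot \Big(b_i - N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p^*_{qM} \sum_s M_{is}\, c_{qis} r_{qi}\Big), \qquad (43)
\end{aligned}
$$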

where inequality (a) holds because the ad assignment step of the online algorithm is equivalent to

$$\forall q, \quad p_q^*(k, Q(k)) \in \arg\max_{p_q} \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is}\, c_{qis} r_{qi} \Big(\frac{1}{\epsilon} - Q_i(k)\Big), \qquad (44)$$

which means that evaluating the objective function in (44) with $p = p^*$ cannot achieve a larger
value. Letting

$$B_2 = \min_i \Big\{ b_i - N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p^*_{qM} \sum_s M_{is}\, c_{qis} r_{qi} \Big\},$$

from inequality (43), we finally obtain

$$E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \le -\frac{N}{\epsilon}\big(R(p^*) - R(p^*(k, Q))\big) + B_1 - B_2 \sum_i Q_i.$$
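The queue-weighted assignment rule used in this proof can be illustrated by the following sketch. It is not the paper's full algorithm: it assumes a single webpage slot per query ($L = 1$), so that the maximization reduces to picking one client, and it assumes that no ad is shown when every weight is nonpositive. The function names `pick_client` and `update_queue` are hypothetical.

```python
def pick_client(ctr, price, queues, eps):
    """Pick the client maximizing ctr * price * (1/eps - Q_i) for one query.

    ctr[i]    : click-through rate of client i's ad for this keyword
    price[i]  : revenue per click paid by client i
    queues[i] : current overdraft queue length Q_i
    Returns the client index, or None if no weight is positive
    (assumption: showing no ad is allowed).
    """
    best, best_weight = None, 0.0
    for i, (c, r, q) in enumerate(zip(ctr, price, queues)):
        weight = c * r * (1.0 / eps - q)
        if weight > best_weight:
            best, best_weight = i, weight
    return best


def update_queue(q, revenue, budget):
    """End-of-cycle update Q <- [Q + revenue - budget]^+."""
    return max(q + revenue - budget, 0.0)


ctr, price, eps = [0.1, 0.2], [1.0, 1.0], 0.1
print(pick_client(ctr, price, [0.0, 0.0], eps))    # client 1: higher expected revenue
print(pick_client(ctr, price, [0.0, 8.0], eps))    # client 0: client 1 is near its budget
print(pick_client(ctr, price, [12.0, 11.0], eps))  # None: both queues exceed 1/eps
```

A client whose queue approaches $1/\epsilon$ is served less and less often, which is the mechanism behind the $O(1/\epsilon)$ overdraft bound. With more than one slot per page, the per-query step would be a maximization over assignment matrices $M \in \mathcal{M}_q$ rather than a single argmax.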
B. Proof of Theorem 1

The first inequality, which states that the online algorithm cannot do better than the offline
optimal solution, is immediate, so we omit the details. (A rigorous proof is also easy: define
the "per-client revenue region" as in Subsection IV-B and then use
the fact that the average revenue vector $\lambda$ corresponding to our online algorithm falls inside that
region, according to an inequality which is implied by stability.)

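We now focus on the second inequality, i.e., the $O(\epsilon)$ convergence bound. From Lemma 1,

$$
\begin{aligned}
E\big[R(p^*) - R(p^*(k, Q(k)))\big] &\le \frac{\epsilon}{N} \cdot E\Big[B_1 - B_2 \sum_i Q_i(k) + V(Q(k)) - E[V(Q(k+1)) \mid Q(k)]\Big] \\
&\le \frac{\epsilon}{N}\big(B_1 + E[V(Q(k))] - E[V(Q(k+1))]\big).
\end{aligned}
$$

Adding the terms for $0 \le k \le K-1$ and dividing by $K$, we get

$$\frac{1}{K}\sum_{k=0}^{K-1} E\big[R(p^*) - R(p^*(k, Q(k)))\big] \le \frac{\epsilon}{N}\Big(B_1 + \frac{E[V(Q(0))]}{K}\Big).$$

Since $V(Q(0)) < \infty$, we get the following limit expression:

$$\lim_{K \to \infty} \frac{1}{K}\sum_{k=0}^{K-1} E\big[R(p^*) - R(p^*(k, Q(k)))\big] \le \frac{B_1 \epsilon}{N}. \qquad (45)$$

Finally, because

$$E[N R(p^*) - R(k)] = E\big[E[N R(p^*) - R(k) \mid Q(k)]\big] = N \cdot E\big[R(p^*) - R(p^*(k, Q(k)))\big],$$

inequality (45) is equivalent to

$$\lim_{K \to \infty} E\Big[\frac{1}{K}\sum_{k=0}^{K-1} \big(N R(p^*) - R(k)\big)\Big] \le B_1 \epsilon.$$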
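C. Proof of Corollary 1

Continuing from inequality (42) in Appendix A (the proof of Lemma 1), we get

$$
\begin{aligned}
& E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \\
&\le -N \sum_q \nu_q \sum_{i,s} [M^*(q, t, Q)]_{is}\, c_{qis} r_{qi} \Big(\frac{1}{\epsilon} - Q_i\Big) + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \\
&\overset{(a)}{\le} -\frac{N}{1+\Delta} \sum_q \nu_q \sum_{i,s} [M^*(q, t, Q)]_{is}\, \hat{c}_{qis} r_{qi} \Big(\frac{1}{\epsilon} - Q_i\Big) + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \\
&\overset{(b)}{\le} -\frac{N}{1+\Delta} \sum_q \nu_q \sum_{i,s} \Big(\frac{1}{\epsilon} - Q_i\Big) \sum_{M \in \mathcal{M}_q} p^*_{qM} M_{is}\, \hat{c}_{qis} r_{qi} + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \\
&\overset{(c)}{\le} -\frac{1-\Delta}{1+\Delta} N \sum_q \nu_q \sum_{i,s} \Big(\frac{1}{\epsilon} - Q_i\Big) \sum_{M \in \mathcal{M}_q} p^*_{qM} M_{is}\, c_{qis} r_{qi} + \frac{N}{\epsilon} R(p^*(k, Q)) + B_1 - \sum_i Q_i b_i \\
&= -\frac{N}{\epsilon}\Big(\frac{1-\Delta}{1+\Delta} R(p^*) - R(p^*(k, Q))\Big) + B_1 - \sum_i Q_i \cdot \Big(b_i - \frac{1-\Delta}{1+\Delta} N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p^*_{qM} \sum_s M_{is}\, c_{qis} r_{qi}\Big). \qquad (46)
\end{aligned}
$$

Here, inequalities (a) and (c) hold respectively because $\hat{c} \le c(1 + \Delta)$ and $\hat{c} \ge c(1 - \Delta)$, together with
the fact that all the coefficients in this summation are nonnegative. Inequality (b) holds because
equation (12) in the online algorithm with estimated click-through rates is equivalent to

$$\forall q, \quad p_q^*(k, Q(k)) \in \arg\max_{p_q} \sum_{M \in \mathcal{M}_q} p_{qM} \sum_{i,s} M_{is}\, \hat{c}_{qis} r_{qi} \Big(\frac{1}{\epsilon} - Q_i(k)\Big), \qquad (47)$$

which means that evaluating the objective function in (47) with $p = p^*$ cannot achieve a larger
value. Letting

$$B_2' \triangleq \min_i \Big\{ b_i - \frac{1-\Delta}{1+\Delta} N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p^*_{qM} \sum_s M_{is}\, c_{qis} r_{qi} \Big\},$$

from inequality (46), we finally obtain

$$E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \le -\frac{N}{\epsilon}\Big(\frac{1-\Delta}{1+\Delta} R(p^*) - R(p^*(k, Q))\Big) + B_1 - B_2' \sum_i Q_i.$$

Therefore, similarly as in the proof of Theorem 1, we can finally show that

$$\lim_{K \to \infty} E\Big[\frac{1}{K}\sum_{k=0}^{K-1} \Big(\frac{1-\Delta}{1+\Delta} N R(p^*) - R(k)\Big)\Big] \le B_1 \epsilon.$$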



D. Proof of Lemma 2

By an approach similar to that in the proof of Lemma 1 (Appendix A), we obtain an upper bound, inequality (48), on the drift

$$E[V(Q(k+1)) \mid Q(k) = Q] - V(Q),$$

in which $D_1$ is an upper bound on $\frac{1}{2}\sum_i \big(E[S_i^2(k, Q, u(k))] + E[\tilde{m}_i^2(k)]\big)$ and defined as

$$D_1 \triangleq \frac{1}{2}\Big(N(N-1)L^2 + NL + \sum_i \big(\lceil m_i \rceil^2 (m_i - \lfloor m_i \rfloor) + \lfloor m_i \rfloor^2 (1 - m_i + \lfloor m_i \rfloor)\big)\Big).$$

Note that inequality (48) has the same form as inequality (43) in the proof of Lemma 1 except
that the offline optimum $p^*$ is replaced by some $p \in \mathcal{J}$. Letting

$$D_2 = \min_i \Big\{ N \sum_q \nu_q \sum_{M \in \mathcal{M}_q} p_{qM} \sum_s M_{is} - m_i \Big\},$$

it is always possible to pick a $p \in \mathcal{J}$ such that $D_2 > 0$ (unless $\mathcal{J}$ is a degenerate set which
has at most one element). We further bound the above inequality as

$$E[V(Q(k+1)) \mid Q(k) = Q] - V(Q) \le \frac{D_3}{\epsilon} + D_1 - D_2 \sum_i Q_i.$$

Here, $D_3 = N \cdot \max_{p \in \mathcal{J}} J(p)$, where $\mathcal{J}$ is defined in (31). This concludes our proof.

E. Short-Term Clients and Non-Stationary Query Arrivals

We focus on the click-through rate maximization problem, although a similar model and
solution can be used for the revenue maximization problem.

First, consider how to include short-term clients in the system. Let us index long-term clients
from 1 to $n$, the $i$-th of which has an average impression requirement of $m_i$ per requirement
cycle. There are further $\tilde{n}$ types of short-term clients indexed by $n + 1 \le i \le n + \tilde{n}$. Each
short-term client of type $i$ has an impression requirement of $l_i$ per contract term. Without loss of
generality, we assume that the contract term of any short-term client is equal to one requirement
cycle. In each requirement cycle $k$, there are $X_i(k)$ clients of type $i$ in the system, where $X_i(k)$
follows a stationary stochastic process with mean $\bar{X}_i$ and $X_i(k)$ is known at the beginning of
requirement cycle $k$.

Correspondingly, in an ad assignment matrix $M$, the first $n$ rows and the subsequent $\tilde{n}$ rows
represent the $n$ long-term clients and the $\tilde{n}$ types of short-term clients, respectively. If
short-term type $j$ is assigned to some webpage slot, one out of the $X_j(k)$ clients of this type is chosen
uniformly at random due to their homogeneity.

Additionally, for a short-term client of type $i$, the algorithm actually aims to satisfy
only $(1 - \alpha_i) l_i$ of the impression requirement, where $\alpha_i \in [0, 1]$ is called the "unfulfilled rate" for clients of type $i$ and is
to be determined by the algorithm. A strictly convex and monotonically increasing function
$\phi(\alpha_i) \in [0, \infty)$ is then introduced to measure the "unhappiness" of short-term clients about
unfulfilled impression requirements, and is deducted from the original objective function, the "overall
average click-through rate" in (27), after being scaled by some predetermined weight $w_i$ which reflects
the importance of the new metric "unfulfilled rate."

The second extension from the original model is to consider a more general query arrival
pattern. We introduce a new time scale, the "stationary-arrival period," between the fast one, the "time slot"
$t$, and the slow one, the "requirement cycle" $k$; namely, one requirement cycle equals $H$
stationary-arrival periods (assuming that $N/H \in \mathbb{Z}^+$ and usually $N/H \gg 1$), and we assume that query
arrivals with respect to each keyword $q$ form a stationary stochastic process with rate $\nu_q(h)$
within the $h$-th stationary-arrival period in one requirement cycle for all $1 \le h \le H$. This is a
more reasonable assumption for the query arrival pattern in the real Internet. For example, in one
day, the query arrivals are stationary within each individual hour, non-stationary across different
hours, and stationary in the same hour across different days. This corresponds to $H = 24$,
although setting a contract term (already assumed to be equal to one requirement cycle) as one
day would only be a simplification for ease of exposition. Based on this example, in the following
text we are going to use "day" and "hour" instead of "requirement cycle" and "stationary-arrival
period" to better describe the basic ideas.
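To make the three time scales concrete, the following sketch generates a piecewise-stationary arrival sequence for a single keyword. It assumes, purely for illustration, at most one Bernoulli arrival per time slot with probability $\nu_q(h)$; the paper does not specify this distribution, and the names `hour_of_slot` and `simulate_arrivals` are hypothetical.

```python
import random


def hour_of_slot(t, N, H):
    """Map a global time slot t to its 1-based hour index h within the day.

    N is the number of time slots per day (requirement cycle) and H the
    number of hours (stationary-arrival periods); N must be a multiple of H.
    """
    if N % H != 0:
        raise ValueError("N/H must be a positive integer")
    slots_per_hour = N // H
    return (t % N) // slots_per_hour + 1


def simulate_arrivals(rates, N, days, seed=0):
    """Simulate arrivals of one keyword over several days.

    rates[h-1] is the per-slot arrival probability in hour h (assumption:
    Bernoulli arrivals). The rate changes across hours but repeats daily.
    """
    rng = random.Random(seed)
    H = len(rates)
    return [
        1 if rng.random() < rates[hour_of_slot(t, N, H) - 1] else 0
        for t in range(N * days)
    ]


# Toy example: no queries in the first 12 hours, one query per slot afterwards.
arrivals = simulate_arrivals([0.0] * 12 + [1.0] * 12, N=24, days=2)
print(sum(arrivals))  # 24
```

Because the online algorithm never reads the rates $\nu_q(h)$, a simulation of it only needs the generated arrival sequence, not the rate table itself.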

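In summary, the new optimization problem is formulated as

$$\max_{p,\; 0 \le \alpha \le 1} \; \frac{1}{H} \sum_q \sum_{h=1}^{H} \nu_q(h) \sum_{M \in \mathcal{M}_q} p_{qM}(h) \sum_{1 \le i \le n + \tilde{n},\, s} M_{is}\, c_{qis} - \sum_{i=n+1}^{n+\tilde{n}} w_i \phi(\alpha_i)$$

subject to

$$\frac{N}{H} \sum_q \sum_{h=1}^{H} \nu_q(h) \sum_{M \in \mathcal{M}_q} p_{qM}(h) \sum_s M_{is} \ge \begin{cases} m_i, & 1 \le i \le n, \\ (1 - \alpha_i)\, l_i \bar{X}_i, & n + 1 \le i \le n + \tilde{n}, \end{cases}$$

and

$$0 \le p_{qM}(h) \le 1, \; \forall q,\, M \in \mathcal{M}_q,\, 1 \le h \le H; \qquad \sum_{M \in \mathcal{M}_q} p_{qM}(h) \le 1, \; \forall q,\, 1 \le h \le H.$$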

The only modification to the online algorithm described earlier is to add the following
two steps for each type of short-term clients:
• At the beginning of the $k$-th day, update

$$\alpha_i^*(k) = \psi\Big(\frac{l_i X_i(k) \cdot Q_i(k)}{H w_i}\Big),$$

which corresponds to the target "unfulfilled rate" for each type of short-term clients in this
day. Here, the function $\psi = [\phi'^{-1}]_0^1$.
• At the end of the $k$-th day, the "credit queue" $i$ maintained for type $i$ of short-term clients is
updated as

$$Q_i(k+1) = Q_i(k) + (1 - \alpha_i^*(k)) \cdot l_i X_i(k) - S_i(k, Q(k), q(k)),$$

where $S_i(k, Q(k), q(k))$ is defined in (33).
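The two extra steps for short-term clients can be sketched as follows. This is only an illustration under stated assumptions: the unhappiness function is taken to be $\phi(\alpha) = \alpha^2$ (not specified in the paper), and $\psi$ is read as the inverse of $\phi'$ clipped to $[0, 1]$, which is one plausible interpretation of the definition above. The function names are hypothetical.

```python
def target_unfulfilled_rate(l_i, x_ik, q_ik, H, w_i):
    """Target unfulfilled rate alpha_i^*(k) at the beginning of day k.

    Assumption: phi(a) = a**2, so phi'(a) = 2a and its inverse is y / 2;
    psi clips that inverse to the interval [0, 1].
    """
    y = l_i * x_ik * q_ik / (H * w_i)
    return min(max(y / 2.0, 0.0), 1.0)


def update_credit_queue(q_ik, alpha, l_i, x_ik, served):
    """End-of-day update Q_i(k+1) = Q_i(k) + (1 - alpha) * l_i * X_i(k) - S_i(k)."""
    return q_ik + (1.0 - alpha) * l_i * x_ik - served


# Toy numbers (hypothetical): 3 short-term clients of one type, each asking
# for 10 impressions; the type as a whole received 12 impressions today.
alpha = target_unfulfilled_rate(l_i=10, x_ik=3, q_ik=0.8, H=24, w_i=1.0)
q_next = update_credit_queue(q_ik=0.8, alpha=alpha, l_i=10, x_ik=3, served=12)
print(alpha, q_next)  # approximately 0.5 and 3.8
```

A longer credit queue raises the target unfulfilled rate, so the algorithm promises less to a client type that is already behind, while a larger weight $w_i$ pulls the rate back toward zero.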

The conclusions and proofs about near-optimality of the objective value, queueing stability and
the upper bound on the expected queue length are similar to those shown for the original problem
in Subsection IV-A and are hence omitted here.

Note that the online algorithm is still "oblivious" to the query arrivals, even when the arrival
processes become non-stationary to some extent. This is an artifact of the dual decomposition w.r.t.
each hour $h$, in addition to the decomposition w.r.t. each keyword $q$ as we have seen before.



